At 7:30 tonight, a senior algorithm engineer from Alibaba DAMO Academy presents "Semi-supervised pre-trained dialogue model SPACE"
2022-05-15 05:17:12【Aitime theory】
AI TIME welcomes every AI enthusiast to join!
On the evenings of May 10, 11, and 12 at 19:30, this series features talks by Dai Yinpei, senior algorithm engineer at Alibaba DAMO Academy; Wang Benyou, an EU Marie Curie researcher; and Zhang Lifeng, a lecturer at the School of Information, Renmin University of China.
May 10, 19:30-20:30
Dai Yinpei:
Senior algorithm engineer at Alibaba DAMO Academy. He received his master's degree from the Department of Electronic Engineering at Tsinghua University. His research field is natural language processing and conversational AI, with specific directions including dialogue understanding, dialogue management, and large-scale pre-trained dialogue models. He has published a number of papers at ACL, AAAI, SIGIR, and ICASSP, and has served as a reviewer for conferences including ACL, EMNLP, NAACL, and AAAI.
Talk title:
Semi-supervised pre-trained dialogue model SPACE
Abstract:
How to inject human prior knowledge into pre-trained models at low cost has long been a hard problem in NLP.
In this work, the Conversational AI team at DAMO Academy proposes a new training paradigm based on semi-supervised pre-training. A small amount of labeled dialogue data and a large amount of unlabeled dialogue data are pre-trained together in a semi-supervised manner. A consistency regularization loss injects the dialogue policy knowledge contained in the labeled data into the pre-trained model, so that it learns better representations.
The new semi-supervised pre-trained dialogue model, SPACE (Semi-Supervised Pre-trAined Conversation ModEl), focuses first on dialogue policy knowledge.
Experiments show that SPACE 1.0 achieves significant improvements of more than 5% on classic dialogue datasets such as Cambridge MultiWOZ 2.0 and Amazon MultiWOZ 2.1. Under various low-resource settings, SPACE 1.0 also shows stronger few-shot learning ability than existing state-of-the-art models.
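To make the idea of a consistency regularization loss concrete, here is a minimal sketch. It assumes a common formulation in which the same dialogue is passed through the model twice with different dropout noise and the two predicted distributions are pulled together with a symmetric KL term, while labeled examples additionally receive a cross-entropy term. The function names, the weighting, and this exact form are illustrative assumptions, not the SPACE implementation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def consistency_loss(logits_a, logits_b):
    # Symmetric KL between two predictions for the same input
    # (assumed here to come from two forward passes with dropout).
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def semi_supervised_loss(logits_a, logits_b, label=None, weight=1.0):
    # Unlabeled example: consistency term only.
    # Labeled example: consistency term plus averaged cross-entropy.
    loss = weight * consistency_loss(logits_a, logits_b)
    if label is not None:
        ce_a = -math.log(softmax(logits_a)[label])
        ce_b = -math.log(softmax(logits_b)[label])
        loss += 0.5 * (ce_a + ce_b)
    return loss
```

In this sketch, unlabeled dialogues contribute through `semi_supervised_loss(logits_a, logits_b)` and labeled ones through `semi_supervised_loss(logits_a, logits_b, label=k)`, so both kinds of data can share one training loop.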
May 11, 19:30-20:30
Wang Benyou:
PhD student at the University of Padua, Italy, and an EU Marie Curie researcher. In June 2022 he will join the School of Data Science at The Chinese University of Hong Kong, Shenzhen, as an assistant professor. He obtained his master's degree from Tianjin University under the supervision of Professors Song Dawei and Zhang Peng, and has made exchange visits to the University of Copenhagen (Denmark), the University of Montreal (Canada), the University of Amsterdam (Netherlands), Huawei Noah's Ark Lab, the Institute of Theoretical Physics of the Chinese Academy of Sciences, and the Institute of Linguistics of the Chinese Academy of Social Sciences. In industry, he worked full-time at Tencent from 2017, where as the main algorithm designer he built a robust intelligent customer service system from scratch on Tencent Cloud. In his relatively short academic career, he has been committed to building more robust and intelligent natural language processing systems that balance technical soundness with linguistic motivation. So far, he and his collaborators have received a best paper nomination at SIGIR 2017, a top international conference on information retrieval, and the best interpretable paper award at NAACL 2019, a top international conference on natural language processing. He has published more than 20 papers, including at top international conferences such as ICLR, SIGIR, WWW, NAACL, AAAI, IJCAI, and CIKM.
Talk title:
On position embeddings
Abstract:
The Transformer is widely used in NLP tasks (especially in pre-trained models) and has even begun to stand out in computer vision. Without position encoding, the Transformer architecture cannot model the order of its input, so position encoding is very important.
Many current pre-trained models use different position encodings (for example, fully learnable position encoding, fixed trigonometric position encoding, and relative position encoding) and work well in practice. However, a unified framework for understanding and evaluating these position encodings is still lacking.
We first explain the motivation behind trigonometric position encoding: in short, it replaces translation with rotation in order to inject position information into the word vector.
Our latest work formalizes several principled properties of position encoding (translation invariance, monotonicity, and symmetry), evaluates the extent to which existing position encodings satisfy these properties, and finally quantifies how these properties help or hurt downstream tasks.
We find that fully learnable position encoding works well in whole-sentence classification scenarios, thanks to its flexibility in treating the special [CLS] token differently from ordinary positions, whereas relative position encoding works better for span prediction.
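The "rotation instead of translation" view of trigonometric position encoding can be checked numerically. The sketch below assumes the standard sinusoidal encoding with interleaved (sin, cos) pairs; the dimension, the base of 10000, and the function names are illustrative choices, not taken from the talk. Shifting a position by an offset rotates each (sin, cos) pair by an angle proportional to that offset, and the dot product between two encodings depends only on their offset.

```python
import math

def sinusoidal_encoding(pos, dim=8, base=10000.0):
    # One (sin, cos) pair per frequency; dim is assumed to be even.
    vec = []
    for i in range(dim // 2):
        freq = base ** (-2 * i / dim)
        vec.extend([math.sin(pos * freq), math.cos(pos * freq)])
    return vec

def rotate(vec, offset, base=10000.0):
    # Rotate every (sin, cos) pair by offset * freq, using
    # sin(a+b) = sin a cos b + cos a sin b and
    # cos(a+b) = cos a cos b - sin a sin b.
    dim = len(vec)
    out = []
    for i in range(dim // 2):
        angle = offset * base ** (-2 * i / dim)
        s, c = vec[2 * i], vec[2 * i + 1]
        out.extend([s * math.cos(angle) + c * math.sin(angle),
                    c * math.cos(angle) - s * math.sin(angle)])
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

With these helpers, `rotate(sinusoidal_encoding(3), 4)` matches `sinusoidal_encoding(7)` up to floating-point error. The dot product of positions 2 and 5 equals that of positions 10 and 13 (translation invariance), and position 5 is equally similar to positions 2 and 8 (symmetry). Monotonicity is not checked here.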
Student recruitment information
The team of Wang Benyou and Professor Li Haizhou at the School of Data Science, The Chinese University of Hong Kong (Shenzhen), is recruiting 3 fully funded PhD students (admission possible in Fall 2022, Winter 2022, or 2023), 3 research assistants, and 6 postdocs in natural language processing, speech processing, and machine learning.
The team has strong links between industry and academia and rich computing resources, enough to train very large pre-trained language models and to give full play to the research creativity of team members.
PhD applications for admission this fall should begin as early as June. Both undergraduate and master's students can apply for the PhD program. IELTS or TOEFL scores are required (waived for applicants with a foreign degree). PhD graduates receive a degree certificate issued by The Chinese University of Hong Kong. RA and postdoc positions are open on a rolling basis until filled.
For details, please see https://wabyking.github.io/files/JD4PhD-CUHKSZ.pdf or https://zhuanlan.zhihu.com/p/500582441 .
Inquiries can also be sent to [email protected]
May 12, 19:30-20:30
Zhang Lifeng:
Lecturer at the School of Information, Renmin University of China. His main research interests fall into two areas: 1) the methodology of intelligent optimization algorithms such as evolutionary computation, and the application of operations research algorithms and decision support systems in production practice; 2) the theory and methodology of system identification and machine learning, and the application of statistical methods in various fields of data analysis.
Talk title:
Quickly detecting complex correlations in data
Abstract:
Detecting and distinguishing relationships between variables is a basic task in data analysis. Quickly finding and measuring correlated variables not only saves researchers' time but also provides valuable guidance for subsequent analysis and modeling.
This study proposes a new statistical tool, the nearest neighbor correlation coefficient (nCor), which takes a new perspective and can effectively detect associations among continuous, discrete, and categorical variables.
Compared with various mutual information (MI) estimators, MIC, dCor, RDC, HSIC, and other popular methods of recent years, the new method applies to a wider range of data types and complex relationships and has stronger detection power and robustness.
The new method can also better distinguish predictable, heteroscedastic, and interaction relationships, as well as various overlapping complex data relationships, providing deeper and more effective guidance for follow-up analysis and research.
The talk is based on three papers published in recent years and explains the principle and concrete implementation of the new statistic in different application settings.
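The announcement does not define nCor, so the sketch below does not attempt it. Instead, as an illustration of why measures beyond linear correlation are needed, it implements distance correlation (dCor), one of the baselines named above, for one-dimensional samples and places it next to Pearson correlation. The function names are illustrative, and this is the plain biased sample estimator rather than any optimized or bias-corrected variant.

```python
import math

def centered_distances(values):
    # Double-centered matrix of pairwise absolute differences.
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means.
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

def mean_product(a, b):
    # Average of the elementwise product of two n x n matrices.
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)

def distance_correlation(x, y):
    # Sample dCor: sqrt(dCov^2 / sqrt(dVar_x * dVar_y)).
    a, b = centered_distances(x), centered_distances(y)
    dcov2 = mean_product(a, b)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

def pearson(x, y):
    # Ordinary linear correlation coefficient, for comparison.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0
```

For a symmetric sample such as `x = [-3, -2, -1, 0, 1, 2, 3]` with `y = x**2`, Pearson correlation is zero even though `y` is fully determined by `x`, while `distance_correlation(x, y)` is clearly positive. This gap between linear and general dependence is what the methods compared in the talk aim to close.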
After the live broadcast, you can ask questions in the group chat. Add "AI TIME Little Helper" (WeChat ID: AITIME_HY) and reply "PhD-4" to be added to "AI TIME PhD Communication Group 4".
Organizer: AI TIME
Associated media: AI Data pie
Partners: Wisdom spectrum·AI, Chinese Academy of Engineering Zhiling live, School Online, Kou enjoys academic, AMiner, Ever Chain action, Scientific research cloud, An endless stream of Science
About AI TIME
AI TIME was founded in 2019. It aims to promote the spirit of scientific debate by inviting people from all fields to explore the essential questions of AI theory, algorithms, and application scenarios, to encourage the collision of ideas, and to connect AI scholars, industry experts, and enthusiasts around the world. Through debate, it hopes to explore the tensions between artificial intelligence and humanity's future, and the future of artificial intelligence itself.
So far, AI TIME has invited more than 600 speakers from China and abroad, held more than 300 events, and attracted more than 1.5 million viewers.
Author: [Aitime theory]. Please include the original link when reprinting, thank you.