Inner Product Matrix Algorithms for Transfer and Adaptation of Language Representations.
Record type: Bibliographic - Electronic resource : Monograph/item
Title/Author: Inner Product Matrix Algorithms for Transfer and Adaptation of Language Representations. / Sachidananda, Vinayak.
Author: Sachidananda, Vinayak.
Description: 1 online resource (131 pages)
Notes: Source: Dissertations Abstracts International, Volume: 84-09, Section: B.
Contained By: Dissertations Abstracts International, 84-09B.
Subject: Adaptation.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30306126 (click for full text, PQDT)
ISBN: 9798374473261
Inner Product Matrix Algorithms for Transfer and Adaptation of Language Representations.
Sachidananda, Vinayak.
Inner Product Matrix Algorithms for Transfer and Adaptation of Language Representations.
- 1 online resource (131 pages)
Source: Dissertations Abstracts International, Volume: 84-09, Section: B.
Thesis (Ph.D.)--Stanford University, 2022.
Includes bibliographical references
Language is our primary mode of communication. We use language to speak to one another, represent knowledge, interact with and program computers, and express ourselves. As a result, teaching computers to understand language has been a cornerstone task in Artificial Intelligence for many decades. The fundamental challenge in this field has been one of representation: how do we convert language, an arbitrary composition of discrete symbols, into the mathematical structures, vectors and matrices, with which Artificial Intelligence systems operate? Recent breakthroughs in learning this mapping, known as representation learning, have resulted in models that have achieved human parity on a number of language tasks - translation, question answering, and text generation, to name a few. Furthermore, these models have influenced and unified perceptual tasks in the fields of vision, acoustics, and decision making, advancing our ability to build intelligent systems.

In this work, we design and analyze algorithms for adapting language representations - the mapping from language as we perceive it to the vectors and matrices that computers can understand. How do we teach computers to convert words, sentences, and documents to numerical form? Naturally, one may expect that a good conversion of language to numbers would preserve similarity: similar language such as "hotel" and "inn" should be represented by nearby numbers, and dissimilar language such as "good" and "bad" by distant ones. The field of language representation learning seeks to answer these questions by encoding language as vectors and matrices. In this setting, similarity and dissimilarity are mathematically represented by inner products between vectors. Two recent neural-network-based models for learning representations of language, word embeddings and transformers, have led to breakthroughs by encoding these similarities and dissimilarities using large unstructured text corpora from the Internet. However, some fundamental challenges remain. In this work, we develop algorithms that allow computational models to adapt language representations to different domains, languages, and modalities - a line of work formally known as domain adaptation and transfer learning. This enables a single Artificial Intelligence model to understand legal reports, medical documents, financial due diligence, text from various languages, and even programming code. The unifying theme of the algorithms discussed is that they operate on the inner products between language inputs - the source of encoded information in these representations.

The initial works in this thesis focus on domain adaptation, which allows models to understand language that is specific to individual communities - such as that used by engineers, doctors, or lawyers. The first algorithm we discuss establishes a metric equivalence between the Frobenius norm of the difference of the Gram matrices of two sets of representations and the residual of the Orthogonal Procrustes problem. The former metric, the Global Anchor Method, is more general, as it can be applied regardless of dimensionality. We highlight the benefits of this algorithm in adapting a conversational agent to perform diagnostics and troubleshooting in the networking domain.
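The relationship between these two metrics can be made concrete in a few lines of code. Below is a minimal sketch, assuming the Global Anchor metric is computed as the Frobenius norm of the difference of the Gram matrices, as described above, and that the two embedding matrices X and Y have rows indexed by a shared vocabulary; the function names are illustrative and not taken from the thesis.

```python
import numpy as np

def global_anchor_distance(X, Y):
    """Frobenius norm of the difference of the Gram matrices.

    Applicable even when X and Y have different dimensionalities,
    since X @ X.T and Y @ Y.T are both n x n.
    """
    return np.linalg.norm(X @ X.T - Y @ Y.T, ord="fro")

def procrustes_residual(X, Y):
    """Residual of the Orthogonal Procrustes problem:
    minimize ||X Q - Y||_F over orthogonal Q (needs equal dimensionalities).
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    Q = U @ Vt
    return np.linalg.norm(X @ Q - Y, ord="fro")

# Two sets of representations that differ only by an orthogonal map
# should look identical under both metrics.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 100))
Q_true = np.linalg.qr(rng.standard_normal((100, 100)))[0]
Y = X @ Q_true  # same representations up to a rotation/reflection

print(global_anchor_distance(X, Y))  # ~0: Gram matrices are invariant to orthogonal maps
print(procrustes_residual(X, Y))     # ~0: the orthogonal map is recovered exactly
```

Because the Gram matrix X @ X.T is unchanged by any orthogonal transformation of X, the first metric can also compare representations of different dimensionalities, which is the generality noted above.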
Next, we describe methods for domain adaptation of transformer language model tokenizers and highlight applications to domain-specific text classification.

The latter two works in this thesis detail recent approaches in transfer learning for connecting, or aligning, representations from different types of inputs, such as text from different languages or correspondences between natural and programming languages. The first work in this line of research discusses algorithms for mapping sets of representations to a common inner product space. We propose a supervised algorithm, Filtered Inner Product Projection, which operates on pairwise inner products and aligns a source and a target embedding on common pieces of information. This approach is shown to provide state-of-the-art performance on word translation retrieval tasks. Lastly, we discuss work that utilizes inner product matrices to assign batches in contrastive learning - a scalable framework commonly used to connect representations trained on different types of inputs, such as different modalities. In this work, an upper bound on the gap between the total and observed losses in standard contrastive learning settings is relaxed to a Matrix Bandwidth Minimization problem. An efficient algorithm based on bandwidth minimization heuristics, Tail Batch Sampling, is then designed; it is shown to reduce the gap between the total and observed contrastive losses and obtains state-of-the-art results on both sentence embedding and code search tasks.
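As an illustration of how an inner product matrix can drive batch assignment, the following is a minimal sketch that orders examples with a standard bandwidth reduction heuristic (reverse Cuthill-McKee) applied to a thresholded similarity matrix and then cuts the ordering into consecutive batches. It is a hedged approximation of the idea described above, not the thesis's Tail Batch Sampling procedure; the threshold, batch size, and function name are assumptions made for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth_ordered_batches(embeddings, batch_size=8, threshold=0.5):
    """Order examples so that strongly similar pairs (large inner products)
    sit close together, then slice the ordering into consecutive batches."""
    # Pairwise inner products between L2-normalized embeddings.
    Z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = Z @ Z.T

    # Keep only strong similarities as edges of a sparse graph.
    adj = csr_matrix((sims > threshold).astype(np.int8))

    # Reverse Cuthill-McKee is a classic matrix bandwidth reduction heuristic.
    order = reverse_cuthill_mckee(adj, symmetric_mode=True)

    # Consecutive chunks of the ordering become contrastive batches.
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# Example usage with random embeddings (illustrative only).
rng = np.random.default_rng(0)
emb = rng.standard_normal((32, 16))
for batch in bandwidth_ordered_batches(emb):
    print(batch)
```

Grouping highly similar examples into the same batch means their pairwise terms contribute to the observed in-batch loss rather than being ignored, which is how the gap between the total and observed contrastive losses discussed above is reduced.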
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023
Mode of access: World Wide Web
ISBN: 9798374473261
Subjects--Topical Terms: Adaptation.
Index Terms--Genre/Form: Electronic books.
Inner Product Matrix Algorithms for Transfer and Adaptation of Language Representations.
LDR :06122nmm a2200361K 4500
001 2362215
005 20231027103340.5
006 m o d
007 cr mn ---uuuuu
008 241011s2022 xx obm 000 0 eng d
020 $a 9798374473261
035 $a (MiAaPQ)AAI30306126
035 $a (MiAaPQ)STANFORDxj717fb0201
035 $a AAI30306126
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Sachidananda, Vinayak. $3 3702935
245 1 0 $a Inner Product Matrix Algorithms for Transfer and Adaptation of Language Representations.
264 0 $c 2022
300 $a 1 online resource (131 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 84-09, Section: B.
500 $a Advisor: Prabhakar, Balaji; Zhu, Chenguang.
502 $a Thesis (Ph.D.)--Stanford University, 2022.
504 $a Includes bibliographical references
520 $a Language is our primary mode of communication. We use language to speak to one another, represent knowledge, interact with and program computers, and express ourselves. As a result, teaching computers to understand language has been a cornerstone task in Artificial Intelligence for many decades. The fundamental challenge in this field has been one of representation: how do we convert language, an arbitrary composition of discrete symbols, into the mathematical structures, vectors and matrices, with which Artificial Intelligence systems operate? Recent breakthroughs in learning this mapping, known as representation learning, have resulted in models that have achieved human parity on a number of language tasks - translation, question answering, and text generation, to name a few. Furthermore, these models have influenced and unified perceptual tasks in the fields of vision, acoustics, and decision making, advancing our ability to build intelligent systems.
In this work, we design and analyze algorithms for adapting language representations - the mapping from language as we perceive it to the vectors and matrices that computers can understand. How do we teach computers to convert words, sentences, and documents to numerical form? Naturally, one may expect that a good conversion of language to numbers would preserve similarity: similar language such as "hotel" and "inn" should be represented by nearby numbers, and dissimilar language such as "good" and "bad" by distant ones. The field of language representation learning seeks to answer these questions by encoding language as vectors and matrices. In this setting, similarity and dissimilarity are mathematically represented by inner products between vectors. Two recent neural-network-based models for learning representations of language, word embeddings and transformers, have led to breakthroughs by encoding these similarities and dissimilarities using large unstructured text corpora from the Internet. However, some fundamental challenges remain. In this work, we develop algorithms that allow computational models to adapt language representations to different domains, languages, and modalities - a line of work formally known as domain adaptation and transfer learning. This enables a single Artificial Intelligence model to understand legal reports, medical documents, financial due diligence, text from various languages, and even programming code. The unifying theme of the algorithms discussed is that they operate on the inner products between language inputs - the source of encoded information in these representations.
The initial works in this thesis focus on domain adaptation, which allows models to understand language that is specific to individual communities - such as that used by engineers, doctors, or lawyers. The first algorithm we discuss establishes a metric equivalence between the Frobenius norm of the difference of the Gram matrices of two sets of representations and the residual of the Orthogonal Procrustes problem. The former metric, the Global Anchor Method, is more general, as it can be applied regardless of dimensionality. We highlight the benefits of this algorithm in adapting a conversational agent to perform diagnostics and troubleshooting in the networking domain. Next, we describe methods for domain adaptation of transformer language model tokenizers and highlight applications to domain-specific text classification.
The latter two works in this thesis detail recent approaches in transfer learning for connecting, or aligning, representations from different types of inputs, such as text from different languages or correspondences between natural and programming languages. The first work in this line of research discusses algorithms for mapping sets of representations to a common inner product space. We propose a supervised algorithm, Filtered Inner Product Projection, which operates on pairwise inner products and aligns a source and a target embedding on common pieces of information. This approach is shown to provide state-of-the-art performance on word translation retrieval tasks. Lastly, we discuss work that utilizes inner product matrices to assign batches in contrastive learning - a scalable framework commonly used to connect representations trained on different types of inputs, such as different modalities. In this work, an upper bound on the gap between the total and observed losses in standard contrastive learning settings is relaxed to a Matrix Bandwidth Minimization problem. An efficient algorithm based on bandwidth minimization heuristics, Tail Batch Sampling, is then designed; it is shown to reduce the gap between the total and observed contrastive losses and obtains state-of-the-art results on both sentence embedding and code search tasks.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538 $a Mode of access: World Wide Web
650 4 $a Adaptation. $3 3562958
650 4 $a Language. $3 643551
650 4 $a Ablation. $3 3562462
650 4 $a Bilingualism. $3 730672
650 4 $a Optimization. $3 891104
650 4 $a Natural language. $3 3562052
650 4 $a Bilingual education. $3 2122778
650 4 $a Education. $3 516579
655 7 $a Electronic books. $2 lcsh $3 542853
690 $a 0679
690 $a 0800
690 $a 0282
690 $a 0515
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a Stanford University. $3 754827
773 0 $t Dissertations Abstracts International $g 84-09B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30306126 $z click for full text (PQDT)
Holdings
Barcode: W9484571
Location: Electronic resources
Circulation category: 11. Online access
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Attachments: 0