Important AI Paper List

Introduction

In almost all citations, it is difficult to pick out the title of a research paper. Why? Because the contributors’ names come first, and most of the time those names are hard to read for anyone who is not a native speaker of the authors’ language. For example, an Indian reader finds a name like “Vivek Ramaswami, Kartikeyan Karunanidhi” easy to read, but the same name is difficult for non-Indian readers, and vice versa. Giving credit to the creators is very important, but more than that we need to know what they have done. I know from experience that almost every researcher finds it hard to keep track of good AI research papers. For me it is even harder, because I maintain this blog and want to reference the same work across different webpages. Therefore I am creating a citation key, which consists of the last name of the first researcher plus the year the paper was presented. Along with the key, I give the title of the paper and where it was presented. If you find a particular title interesting for your work, you can search for that paper on Google Scholar, Mendeley, Sci-Hub, or any other place you are familiar and comfortable with. After that, you can download and read the paper at your leisure. I hope you find this list useful for your work.
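The key scheme described above can be sketched in a few lines of Python. This is only an illustration, not the script used to build this list: the function names `base_key` and `assign_keys` are hypothetical, and the rule of appending a, b, c when two papers share a key is an assumption inferred from entries such as [Chen2020a] and [Chen2020b] below.

```python
from collections import Counter, defaultdict

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def base_key(last_name: str, year: int) -> str:
    # Last name of the first researcher + year; spaces are dropped so
    # multi-word surnames still give a single token.
    return f"{last_name.replace(' ', '')}{year}"


def assign_keys(papers):
    """papers: list of (first_author_last_name, year, title) tuples.

    Returns a list of (citation_key, title) pairs in the same order.
    """
    totals = Counter(base_key(name, year) for name, year, _ in papers)
    seen = defaultdict(int)
    keyed = []
    for name, year, title in papers:
        base = base_key(name, year)
        if totals[base] > 1:
            # Assumed convention: disambiguate collisions with a, b, c, ...
            key = base + LETTERS[seen[base]]
            seen[base] += 1
        else:
            key = base
        keyed.append((key, title))
    return keyed


papers = [
    ("Chen", 2020, "Distilling knowledge learned in BERT for text generation"),
    ("Chen", 2020, "Few-shot NLG with pre-trained language model"),
    ("Devlin", 2019, "BERT: pre-training of deep bidirectional transformers"),
]
for key, title in assign_keys(papers):
    print(f"[{key}] {title}")
```

Computing the collision counts up front keeps the suffixes stable: a key only gets a letter when it really is shared, so unique keys stay short.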

Citations

Pretrained Language Models for Text Generation: A Survey

[Bahdanau2015]

Neural machine translation by jointly learning to align and translate. In ICLR, 2015.

[Bao2020]

PLATO-2: towards building an open-domain chatbot via curriculum learning. arXiv preprint arXiv:2006.16779, 2020.

[Brown2020]

Language models are few-shot learners. In NeurIPS, 2020.

[Chen2020a]

Distilling knowledge learned in BERT for text generation. In ACL, 2020.

[Chen2020b]

Few-shot NLG with pre-trained language model. In ACL, 2020.

[Conneau2019]

Cross-lingual language model pretraining. In NeurIPS, 2019.

[Devlin2019]

BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.

[Dong2019]

Unified language model pretraining for natural language understanding and generation. In NeurIPS, 2019.

[Fan2019]

Unsupervised pre-training for sequence to sequence speech recognition. CoRR, arXiv preprint arXiv:1910.12418, 2019.

[Gehring2017]

Convolutional sequence to sequence learning. In ICML, 2017.

[Gong2020]

Tablegpt: Few-shot table-to-text generation with table structure reconstruction and content matching. In COLING, 2020.

[Gu2020]

A tailored pre-training model for task-oriented dialog generation. arXiv preprint arXiv:2004.13835, 2020.

[Guan2020]

Survey on automatic text summarization and transformer models applicability. In CCRIS, 2020.

[Hendrycks2020]

Pretrained transformers improve out-of-distribution robustness. In ACL, 2020.

[Keskar2019]

CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858, 2019.

[Kryscinski2018]

Improving abstraction in text summarization. In EMNLP, 2018.

[Lan2020]

ALBERT: A lite BERT for self-supervised learning of language representations. In ICLR, 2020.

[Lewis2020]

BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In ACL, 2020.

[Li2019]

Generating long and informative reviews with aspect-aware coarse-to-fine decoding. In ACL, pages 1969–1979, 2019.

[Li2020]

Knowledge-enhanced personalized review generation with capsule graph neural network. In CIKM, pages 735–744, 2020.

[Li2021a]

TextBox: A unified, modularized, and extensible framework for text generation. In ACL, 2021.

[Li2021b]

Few-shot knowledge graph-to-text generation with pretrained language models. In Findings of ACL, 2021.

[Li2021c]

Knowledge-based review generation by coherence enhanced text planning. In SIGIR, 2021.

[Lin2020]

Pretraining multilingual neural machine translation by leveraging alignment information. In EMNLP, 2020.

[Liu2019]

Text summarization with pretrained encoders. In EMNLP, 2019.

[Mager2020]

GPT-too: A language-model-first approach for AMR-to-text generation. In ACL, 2020.

[Peters2018]

Deep contextualized word representations. In NAACL-HLT, 2018.

[Qiu2020]

Pre-trained models for natural language processing: A survey. arXiv preprint arXiv:2003.08271, 2020.

[Radford2019]

Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.

[Raffel2020]

Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR, 2020.

[Ribeiro2020]

Investigating pretrained language models for graph-to-text generation. arXiv preprint arXiv:2007.08426, 2020.

[Ross2012]

Guide for conducting risk assessments. In NIST Special Publication, 2012.

[Rothe2020]

Leveraging pre-trained checkpoints for sequence generation tasks. TACL, 2020.

[Sanh2019]

Distilbert, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.

[See2017]

Get to the point: Summarization with pointer-generator networks. In ACL, 2017.

[Song2019]

MASS: masked sequence to sequence pre-training for language generation. In ICML, 2019.

[Sun2019a]

Contrastive bidirectional transformer for temporal representation learning. arXiv preprint arXiv:1906.05743, 2019.

[Sun2019b]

Videobert: A joint model for video and language representation learning. In ICCV, 2019.

[Vaswani2017]

Attention is all you need. In NIPS, 2017.

[Wada2018]

Unsupervised cross-lingual word embedding by multilingual neural language models. arXiv preprint arXiv:1809.02306, 2018.

[Wolf2019]

Transfertransfo: A transfer learning approach for neural network based conversational agents. arXiv preprint arXiv:1901.08149, 2019.

[Xia2020]

XGPT: cross-modal generative pre-training for image captioning. arXiv preprint arXiv:2003.01473, 2020.

[Xu2020a]

Discourse-aware neural extractive text summarization. In ACL, 2020.

[Xu2020b]

Unsupervised extractive summarization by pre-training hierarchical transformers. In EMNLP, 2020.

[Yang2020a]

CSP: code-switching pre-training for neural machine translation. In EMNLP, 2020.

[Yang2020b]

TED: A pretrained unsupervised summarization model with theme modeling and denoising. In EMNLP (Findings), 2020.

[Zaib2020]

A short survey of pre-trained language models for conversational AI-A new age in NLP. In ACSW, 2020.

[Zeng2020]

Generalized conditioned dialogue generation based on pre-trained language model. arXiv preprint arXiv:2010.11140, 2020.

[Zhang2019a]

Pretraining-based natural language generation for text summarization. In CoNLL, 2019.

[Zhang2019b]

HIBERT: document level pre-training of hierarchical bidirectional transformers for document summarization. In ACL, 2019.

[Zhang2019c]

ERNIE: enhanced language representation with informative entities. In ACL, 2019.

[Zhang2020]

DIALOGPT: Large-scale generative pre-training for conversational response generation. In ACL, 2020.

[Zhao2020]

Knowledge-grounded dialogue generation with pretrained language models. In EMNLP, 2020.

[Zheng2019]

Sentence centrality revisited for unsupervised summarization. In ACL, 2019.

[Zhou2020]

Unified vision-language pre-training for image captioning and VQA. In AAAI, 2020.

Survey on Automatic Text Summarization and Transformer Models Applicability

[CohanA2018]

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 615–621.

[NenkovaA2007]

The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Transactions on Speech and Language Processing 4, 2 (2007).

[RadfordA]

Improving language understanding by generative pre-training. www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf

[RasimMA2013]

Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications 40, 5 (2013), 1675–1689.

[RasimMA]

MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications 38, 12 (2011), 14514–14522.

[VaswaniA2017]

Attention is all you need. Advances in neural information processing systems (2017), 5998–6008.

[RaffelC2019]

Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv:1910.10683 (2019).

[BahdanauD2014]

Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 (2014).

[GunesE2004]

LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence 20, 1 (2004), 457–479.

[ZhangH]

Pretraining-Based Natural Language Generation for Text Summarization. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). 789–797.

[DevlinJ2019]

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171–4186.

[HowardJ]

Universal Language Model Fine-tuning for Text Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 328–339.

[ZhangJ2019]

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. arXiv:1912.08777 (2019).

[KaikhahK]

Text summarization using neural networks. In Proceeding of second conference on intelligent system. 40–44.

[XuK]

Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International conference on machine learning. 2048–2057.

[Chin-YewL]

ROUGE: A package for automatic evaluation of summaries. In Proceedings of ACL Workshop “Text Summarization Branches Out”. 8.

[M2019]

BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv:1910.13461 (2019).

[Ch2011]

A statistical approach for automatic text summarization by extraction. In Proceedings of 2011 International Conference on Communication Systems and Network Technologies. 268–271.

[ConroyJM]

Text Summarization via Hidden Markov Models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 406–407.

[PetersM2018]

Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2227–2237.

[RushAM2015]

A neural attention model for abstractive sentence summarization. arXiv:1509.00685 (2015).

[VinyalsO2015]

Pointer networks. Advances in neural information processing systems (2015), 2692–2700.

[DragomirRR2004]

Centroid-based summarization of multiple documents. Information Processing & Management 40, 6 (2004), 919–938.

[MihalceaR2004]

Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing. 404–411.

[NallapatiR2016]

Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv:1602.06023 (2016).

[OakR2015]

Extractive techniques for automatic document summarization: a survey. International Journal of Innovative Research in Computer and Communication Engineering 4, 3 (2016), 4158–4164.

[ParkerR2011]

English Gigaword. https://catalog.ldc.upenn.edu/LDC2011T07

[ChopraS2016]

Abstractive sentence summarization with attentive recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 93–98.

[EvanS2008]

The New York Times Annotated Corpus. https://catalog.ldc.upenn.edu/LDC2008T19

[EdunovS2019]

Pre-trained language model representations for language generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. 4052–4059.

[NarayanS2018]

Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 1797–1807.

[PeterJ2017]

Get to the point: Summarization with pointer-generator networks. arXiv:1704.04368 (2017).

[GuptaV2010]

A Survey of Text Summarization Extractive Techniques. Journal of Emerging Technologies in Web Intelligence 2, 3 (2010), 258–268.

[SanhV2019]

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arxiv.org/pdf/1910.01108 (2019).

[LiuY2019]

Roberta: A robustly optimized bert pretraining approach. arXiv:1907.11692 (2019).

[YanY2020]

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. arXiv:2001.04063 (2020).

[DaiZ]

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2978–2988.

[LanZ2019]

Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942 (2019).

[YangZ2019]

Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems (2019), 5754–5764.

CTRL: A Conditional Transformer Language Model For Controllable Generation

[Martín2016]

Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.

[Rohan2019]

Memory-efficient adaptive optimization for large-scale learning. arXiv preprint arXiv:1901.11150, 2019.

[Martin2017]

Wasserstein generative adversarial networks. In International conference on machine learning, pp. 214–223, 2017.

[Matthew2017]

Factsheets: Increasing trust in AI services through supplier’s declarations of conformity, August 2018. arXiv:1808.07261 [cs.CY].

[Mikel2017]

Unsupervised neural machine translation. arXiv preprint arXiv:1710.11041, 2017.

[Jimmy2016]

Layer normalization. CoRR, abs/1607.06450, 2016.

[Loïc2019]

Findings of the 2019 conference on machine translation (wmt19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pp. 1–61, 2019.

[Yoshua2003]

A neural probabilistic language model. Journal of machine learning research, 3(Feb):1137–1155, 2003.

[Thorsten2007]

Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858–867, 2007.

[Miles2016]

Artificial intelligence and responsible innovation. In Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence, pp. 543–554. Springer, 2016.

[Miles2019]

The malicious use of artificial intelligence: Forecasting, prevention, and mitigation, February 2019. arXiv:1802.07228 [cs.AI].

[Isaac2019]

Tagged back-translation. arXiv preprint arXiv:1906.06442, 2019.

[Xi2016]

Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in neural information processing systems, pp. 2172–2180, 2016.

[Rewon2019]

Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509, 2019.

[Ronan2008]

A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning, pp. 160–167. ACM, 2008.

[Ronan2011]

Natural language processing (almost) from scratch. Journal of machine learning research, 12(Aug):2493–2537, 2011.

[Ruth1987]

The consumption junction: A proposal for research strategies in the sociology of technology. In Wiebe E. Bijker, Thomas P. Hughes, and Trevor J. Pinch (eds.), The Social Construction of Technological Systems, pp. 261–280. MIT Press, Cambridge, MA, USA, 1987.

[Andrew2015]

Semi-supervised sequence learning. In Advances in neural information processing systems, pp. 3079–3087, 2015.

[Zihang2019]

Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860, 2019.

[Jacob2018]

Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

[John2011]

Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul):2121–2159, 2011.

[Matthew2017]

Searchqa: A new q&a dataset augmented with context from a search engine. arXiv preprint arXiv:1704.05179, 2017.

[Angela2018]

Hierarchical neural story generation. arXiv preprint arXiv:1805.04833, 2018.

[Angela2019]

Eli5: Long form question answering. arXiv preprint arXiv:1907.09190, 2019.

[Boris2019]

Stochastic gradient methods with layer-wise adaptive moments for training of deep networks. arXiv preprint arXiv:1905.11286, 2019.

[Ian2014]

Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.

[Max2018]

Newsroom: A dataset of 1.3 million summaries with diverse extractive strategies. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 708–719, New Orleans, Louisiana, June 2018.

[Kaiming2016]

Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.

[Karl2015]

Teaching machines to read and comprehend. In Advances in neural information processing systems, pp. 1693–1701, 2015.

[Ari2019]

The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751, 2019.

[Jeremy2018]

Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.

[Hakan2016]

Tying word vectors and word classifiers: A loss framework for language modeling. arXiv preprint arXiv:1611.01462, 2016.

[Melvin2017]

Google’s multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics, 5:339–351, 2017.

[Mandar2017]

Triviaqa: A large scale distantly supervised challenge dataset for reading comprehension. arXiv preprint arXiv:1705.03551, 2017.

[David2012]

Self-censorship is not enough. Nature, 492(7429):345–347, December 2012. doi: 10.1038/492345a.

[Lukasz2017]

One model to learn them all. arXiv preprint arXiv:1706.05137, 2017.

[Łukasz2018]

Fast decoding in sequence models using discrete latent variables. arXiv preprint arXiv:1803.03382, 2018.

[Nitish2019]

Unifying question answering and text classification via span extraction. arXiv preprint arXiv:1904.09286, 2019.

[Diederik2014]

Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[Diederik2013]

Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.

[Ryan2015]

Skip-thought vectors. In Advances in neural information processing systems, pp. 3294–3302, 2015.

[Catherine2016]

Domain control for neural machine translation. arXiv preprint arXiv:1612.06140, 2016.

[Wojciech2019]

Neural text summarization: A critical evaluation. arXiv preprint arXiv:1908.08960, 2019.

[Tom2019]

Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466, 2019.

[Guillaume2019a]

Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291, 2019.

[Guillaume2019b]

Large memory layers with product keys. arXiv preprint arXiv:1907.05242, 2019.

[Hector2012]

The winograd schema challenge. In Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning, 2012.

[Patrick2019]

Unsupervised question answering by cloze translation. arXiv preprint arXiv:1906.04980, 2019.

[Minh-Thang2015]

Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114, 2015.

[Julian2015]

Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 43–52. ACM, 2015.

[Bryan6294]

Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems, pp. 6294.

[Bryan2018]

The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730, 2018.

[Stephen2017]

Regularizing and optimizing lstm language models. arXiv preprint arXiv:1708.02182, 2017.

[Tomas2013]

Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pp. 3111–3119, 2013.

[Margaret2019]

Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* ’19), January 2019. doi: 10.1145/3287560.3287596.

[Amit2019]

Filling gender & number gaps in neural machine translation with black-box context injection. arXiv preprint arXiv:1903.03467, 2019.

[Vinod2010]

Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, 2010.

[Ramesh2016]

Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv preprint arXiv:1602.06023, 2016.

[Matthew2018]

Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.

[Carol1979]

Constraints on language mixing: intrasentential code-switching and borrowing in Spanish/English. Language, pp. 291–318, 1979.

[Shana1980]

Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching. Linguistics, 18(7-8):581–618, 1980.

[Ofir2016]

Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016.

[Alec2018]

Improving language understanding by generative pre-training. URL https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf, 2018.

[Alec2019]

Language models are unsupervised multitask learners. URL https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf, 2019.

[Nazneen2019]

Explain yourself! leveraging language models for commonsense reasoning. arXiv preprint arXiv:1906.02361, 2019.

[Pranav2016]

Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.

[Alexander2015]

A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685, 2015.

[Evan2008]

The new york times annotated corpus. Linguistic Data Consortium, Philadelphia, 6(12):e26752, 2008.

[Thomas2019]

Answers unite! unsupervised metrics for reinforced summarization models. arXiv preprint arXiv:1909.01610, 2019.

[Abigail2017]

Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pp. 1073–1083, 2017.

[Rico2015]

Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 2015.

[Noam2018]

Adafactor: Adaptive learning rates with sublinear memory cost. arXiv preprint arXiv:1804.04235, 2018.

[Jack2013]

Developing a framework for responsible innovation. Research Policy, 42(9):1568–1580, November 2013. doi: 10.1016/j.respol.2013.05.008.

[Ilya2014]

Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104–3112, 2014.

[Trieu2018]

A simple method for commonsense reasoning. arXiv preprint arXiv:1806.02847, 2018.

[Adam2016]

Newsqa: A machine comprehension dataset. arXiv preprint arXiv:1611.09830, 2016.

[Lav2019]

Pretrained AI models: Performativity, mobility, and change, September 2019. arXiv:1909.03290 [cs.CY].

[Curran2018]

Glue: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 2018.

[Sean2019]

Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319, 2019.

[Yonghui2016]

Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.

[Stratos2019]

Sumqe: a bert-based summary quality estimation model. arXiv preprint arXiv:1909.00578, 2019.

[Zhilin2018]

Hotpotqa: A dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600, 2018.

[Rowan2019]

Defending against neural fake news. arXiv preprint arXiv:1905.12616, 2019.

[Fangxiaoyu2022]

Language-agnostic BERT Sentence Embedding - LaBSE

BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding-based transfer learning.
This paper explores BERT-based cross-lingual sentence embeddings.
It combines the best methods for learning monolingual and cross-lingual representations, including masked language modeling (MLM) and translation language modeling (TLM).
Introducing a pre-trained multilingual language model dramatically reduces the amount of parallel training data required to achieve good performance.
The resulting model achieves high bi-text retrieval accuracy over 112 languages.
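To make "bi-text retrieval accuracy" concrete, here is a minimal sketch of how such a score can be computed from sentence embeddings. It is not the evaluation code from the paper: the vectors below are made-up toy values rather than LaBSE outputs, the helper names are hypothetical, and plain cosine similarity with a nearest-neighbour lookup is assumed as the matching rule.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(source_vecs, target_vecs):
    # For each source sentence, pick the index of the most similar target.
    return [
        max(range(len(target_vecs)), key=lambda j: cosine(s, target_vecs[j]))
        for s in source_vecs
    ]


def retrieval_accuracy(source_vecs, target_vecs):
    # Assumes the corpus is parallel: source i is the translation of target i.
    predictions = retrieve(source_vecs, target_vecs)
    hits = sum(1 for i, j in enumerate(predictions) if i == j)
    return hits / len(predictions)


# Toy embeddings (hypothetical values) for three sentence pairs.
english = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
hindi = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.0, 0.1, 0.9]]

print(retrieve(english, hindi))
print(retrieval_accuracy(english, hindi))
```

With a real model, the two lists would hold the embeddings of the source-language and target-language sentences, and the same accuracy function would apply unchanged.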

NLP Papers Available on my Google Drive

You can download these papers from link

A brief introduction to boosting.pdf
A Closer Look at Fermentors and Bioreactors.pdf
A Comprehensive Survey on Graph Neural Networks.pdf
A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection.pdf
A dataset for detecting irony in Hindi-english code-mixed social media text.pdf
A Framework for Document Specific Error Detection and Corrections in Indic OCR.pdf
A lexicon-based approach for hate speech detection.pdf
A method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (.pdf
A novel automatic satire and irony detection using ensembled feature selection and data mining.pdf
A Pragmatic Analysis Of Humor In Modern Family.pdf
A Selective Overview of Deep Learning.pdf
A Sentiment Analyzer for Hindi Using Hindi Senti Lexicon.pdf
A Survey of Code-switched Speech and Language Processing.pdf
A Survey of the State of Explainable AI for Natural Language Processing.pdf
A Survey on Explainable Artificial Intelligence (XAI) Toward Medical XAI.pdf
A TENGRAM method based part-of-speech tagging of multi-category words in Hindi language.pdf
A transformer-based approach to irony and sarcasm detection.pdf
A2Text-net A novel deep neural network for sarcasm detection.pdf
Adaptive glove and fasttext model for Hindi word embeddings.pdf
AI and Ethics - Operationalising Responsible AI-PAPER.pdf
AI4Bharat-IndicNLP Corpus Monolingual Corpora and Word Embeddings for Indic Languages.pdf
ALBERT A Lite BERT for Self-supervised Learning of Language Representations.pdf
an Analysis of Current Trends for Sanskrit As a Computer Programming Language.pdf
An empirical, quantitative analysis of the differences between sarcasm and Irony.pdf
An Image is Worth 16x16 Words Transformers for Image Recognition at Scale.pdf
Analyzing_The_Expressive_Power_Of_Graph.pdf
AnnCorra Annotating Corpora Guidelines For POS And Chunk Annotation For Indian Languages.pdf
Approaches to Cross-Domain Sentiment Analysis A Systematic Literature Review.pdf
Attention is all you need.pdf
Automatic sarcasm detection A survey.pdf
Automatic satire detection Are you having a laugh.pdf
Bag of tricks for efficient text classification.pdf
Baselines and bigrams Simple, good sentiment and topic classification.pdf
BERT Explained - A list of Frequently Asked Questions.pdf
BERT Pre-training of deep bidirectional transformers for language understanding.pdf
BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories.pdf
Carer Contextualized affect representations for emotion recognition.pdf
CASCADE Contextual Sarcasm Detection in Online Discussion Forums.pdf
Challenges in Deploying Machine Learning a Survey of Case Studies.pdf
Clinical artificial intelligence quality improvement towards continual monitoring and updating of AI algorithms in healthcare.pdf
CLUE based load balancing in replicated web server.pdf
Clues for detecting irony in user-generated contents Oh…!! it_s so easy -).pdf
Code Mixing A Challenge for Language Identification in the Language of Social Media.pdf
Context-based Sarcasm Detection in Hindi Tweets.pdf
Contextualized sarcasm detection on twitter.pdf
Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis.pdf
Data governance A conceptual framework, structured review, and research agenda.pdf
Deep and Dense Sarcasm Detection.pdf
Deep learning based unsupervised POS tagging for Sanskrit.pdf
Detailed human avatars from monocular video.pdf
Detecting Sarcasm is Extremely Easy -).pdf
DIALOGPT Large-Scale Generative Pre-training for Conversational Response Generation.pdf
DistilBERT, a distilled version of BERT smaller, faster, cheaper and lighter.pdf
DRIFT Deep Reinforcement Learning for Functional Software Testing.pdf
Drop A reading comprehension benchmark requiring discrete reasoning over paragraphs.pdf
Dynamic routing between capsules.pdf
Effect of speech coding on speaker identification.pdf
Efficient estimation of word representations in vector space(2).pdf
ELECTRA Pre-training Text Encoders as Discriminators Rather Than Generators.pdf
Embedding Words as Distributions with a Bayesian Skip-gram Model.pdf
Enriching Word Vectors with Subword Information.pdf
Experience Grounds Language.pdf
Exploiting emojis for sarcasm detection.pdf
Exploiting Similarities among Languages for Machine Translation.pdf
Exploring the fine-grained analysis and automatic detection of irony on Twitter(2).pdf
Exploring the fine-grained analysis and automatic detection of irony on Twitter.pdf
Exploring the impact of pragmatic phenomena on irony detection in tweets A multilingual corpus study.pdf
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.pdf
Extensions to HMM-based Statistical Word Alignment Models.pdf
Fairness_In_Machine_Learning_A_Survey.pdf
Fake news detection of Indian and United States election data using machine learning algorithm.pdf
Fake News Detection on Social Media.pdf
FakeNewsNet A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social.pdf
Faster R-CNN Towards Real-Time Object Detection with Region Proposal Networks.pdf
FastText.zip Compressing text classification models.pdf
Figurative messages and affect in Twitter Differences between #irony, #sarcasm and #not.pdf
Forecasting COVID-19 Confirmed Cases in Major Indian Cities and Their Connectedness with Mobility and Weather-related Parameters.pdf
From English To Foreign Languages Transferring Pre-trained Language Models.pdf
FROM Pre-trained Word Embeddings TO Pre-trained Language Models - Focus on BERT.pdf
Going deeper with convolutions.pdf
Graph Machine Learning NeurIPS 2020 Papers.pdf
Grouped Convolutional Neural Networks for Multivariate Time Series.pdf
Grouped Functional Time Series Forecasting An Application to Age-Specific Mortality Rates.pdf
Handbook of approximation algorithms and metaheuristics.pdf
Harnessing context incongruity for sarcasm detection.pdf
Harnessing Online News for Sarcasm Detection in Hindi Tweets.pdf
Hidden Markov Models.pdf
Hidden technical debt in machine learning systems.pdf
Hotpotqa A dataset for diverse, explainable multi-hop question answering.pdf
How multilingual is multilingual BERT.pdf
How to avoid machine learning pitfalls a guide for academic researchers.pdf
How to read a paper.pdf
HuggingFace_s Transformers State-of-the-art Natural Language Processing.pdf
Identifying machine learning techniques for classification of target advertising.pdf
Identifying sarcasm in Twitter A closer look.pdf
Improving Language Understanding by Generative Pre-Training.pdf
Improving the learnability of classifiers for Sanskrit OCR corrections.pdf
Indic sentiReview Natural language processing based sentiment analysis on major Indian languages.pdf
Interactive-and-Visual-Prompt-Engineering-for-adhoc-Task-Adaptation-LLM.pdf
Investigations in computational sarcasm.pdf
Irony detection in twitter The role of affective content.pdf
Irony, Sarcasm and Parody in the American Sitcom Modern Family.pdf
iSarcasm A Dataset of Intended Sarcasm.pdf
K-means with Three different Distance Metrics.pdf
Knowledge Representation in Sanskrit and Artificial Intelligence.pdf
Learning Graph Search Heuristics.pdf
Learning latent causal graphs via mixture oracles.pdf
LearningSys_2015_paper_32.pdf
Lexicon-Based Methods for Sentiment Analysis.pdf
Lexicon-Based Sentiment Analysis in the Social Web.pdf
LightGBM A highly efficient gradient boosting decision tree.pdf
Linguistic Inquiry and Word Count LIWC2015.pdf
Machine Learning in Automated Text Categorization.pdf
Machine Learning within a Graph Database A Case Study on Link Prediction for Scholarly Data.pdf
Machine Translation Approaches and Survey for Indian Languages.pdf
Machine Translation of Bi-lingual Hindi-English (Hinglish) Text.pdf
Merlion A Machine Learning Library for Time Series.pdf
Mining of Massive Datasets.pdf
MLP-Mixer An all-MLP Architecture for Vision.pdf
Multi-modal sarcasm detection in Twitter with hierarchical fusion model.pdf
Multi-rule based ensemble feature selection model for sarcasm type detection in Twitter.pdf
Multimodal markers of irony and sarcasm.pdf
Natural Language Inference Over.pdf
Natural Language Processing - A Paninian Perspective.pdf
Natural language processing based features for sarcasm detection An investigation using bilingual social media texts.pdf
NeuralProphet Explainable Forecasting at Scale.pdf
On State-of-the-art of POS Tagger, Sandhi Splitter, Alankaar Finder and Samaas Finder for IndoAryan and Dravidian Languages.pdf
Opinion mining and sentiment analysis.pdf
Opinion-Based Entity Ranking (Author_s Draft).pdf
Part-of-speech tagging from 97_ to 100_ Is it time for some linguistics.pdf
PAVE Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models.pdf
Real-time Sentiment Analysis of Hindi Tweets.pdf
Reasoning with sarcasm by reading in-between.pdf
Recent trends in deep learning based natural language processing Review Article.pdf
RECEPTIVE FIELDS OF SINGLE NEURONES IN THE CAT _ S STRIATE CORTEX.pdf
Recognition of consonant-vowel (CV) units under background noise using combined temporal and spectral preprocessing.pdf
Representing social media users for sarcasm detection.pdf
Retrospective Reader for Machine Reading Comprehension.pdf
RoBERTa A Robustly Optimized BERT Pretraining Approach.pdf
Robotics, AI, and.pdf
ROC graphs Notes and practical considerations for researchers.pdf
Sanskrit sandhi splitting using Seq2(Seq)22.pdf
Sanskrit word segmentation using character-level recurrent and convolutional neural networks.pdf
Sarc-M Sarcasm Detection in Typo-graphic Memes.pdf
Sarcasm as contrast between a positive sentiment and negative situation.pdf
Sarcasm Detection in Hindi sentences using Support Vector machine.pdf
Sarcasm detection in tweets.pdf
Sarcasm detection on Twitter A behavioral modeling approach.pdf
Sarcastic sentiment detection in tweets streamed in real time a big data approach.pdf
Scalable linear algebra on a relational database system.pdf
Scaling Large Production Clusters with Partitioned Synchronization.pdf
Semantics-Aware BERT for Language Understanding.pdf
Semi-supervised recognition of sarcastic sentences in twitter and Amazon.pdf
SentencePiece A simple and language independent subword tokenizer and detokenizer for neural text processing.pdf
Sentiment Analysis for Hindi Language.pdf
Sentiment Analysis in a Resource Scarce Language Hindi.pdf
Sentiment Analysis In Hindi.pdf
Sentiment Analysis in Indian languages.pdf
Sentiment Analysis of Hindi Review based on Negation and Discourse Relation.pdf
Sentiment classification using machine learning techniques with syntax features.pdf
Skillful writing of an awful research paper.pdf
Social media and fake news in the 2016 election.pdf
Sound classification using convolutional neural network and tensor deep stacking network.pdf
Sparse, contextually informed models for irony detection Exploiting user communities, entities and sentiment.pdf
SQuAD 100,000 questions for machine comprehension of text.pdf
ST4_Method_Random_Forest.pdf
Statistical Methods in Natural Language Processing.pdf
StructBERT Incorporating Language Structures into Pre-training for Deep Language Understanding.pdf
Structural Studies on Small Amyloid Oligomers RT-6.pdf
Superintelligence.pdf
Systematic literature review of sentiment analysis on Twitter using soft computing techniques.pdf
Text categorization with support vector machines Learning with many relevant features.pdf
Text normalization of code mix and sentiment analysis.pdf
The Differential Role of Ridicule in Sarcasm and Irony.pdf
The highest form of intelligence Sarcasm increases creativity for both expressers and recipients.pdf
The Modern Mathematics of Deep Learning.pdf
The Paninian approach to natural language processing.pdf
The perfect solution for detecting sarcasm in tweets #not.pdf
Thumbs Up or Thumbs Down Semantic Orientation Applied to Unsupervised Classification of Reviews.pdf
THU_NGN at SemEval-2018 Task 3 Tweet Irony Detection with Densely connected LSTM and Multi-task Learning.pdf
TnT - A Statistical Part-of-Speech Tagger.pdf
To BLOB or Not To BLOB Large Object Storage in a Database or a Filesystem.pdf
Towards Demystifying Serverless Machine Learning Training.pdf
Towards multimodal sarcasm detection (an obviously perfect paper).pdf
Towards sub-word level compositions for sentiment analysis of Hindi-English code mixed text.pdf
Triple-View Feature Learning for Medical Image Segmentation.pdf
Twitter as a corpus for sentiment analysis and opinion mining.pdf
Two improved continuous bag-of-word models.pdf
Understanding Diffusion Models A Unified Perspective.pdf
Universal Sentence Encoder.pdf
Unsupervised Irony Detection A Probabilistic Model with Word Embeddings.pdf
UR-Funny A multimodal language dataset for understanding humor.pdf
Use of Sanskrit for natural language processing.pdf
Using TF-IDF to Determine Word Relevance in Document Queries.pdf
Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval.pdf
Very deep convolutional networks for large-scale image recognition.pdf
We are IntechOpen, the world's leading publisher of Open Access books Built by scientists, for scientists TOP 1 _.pdf
When BERT Plays the Lottery, All Tickets Are Winning.pdf
XGBoost A scalable tree boosting system.pdf
XLNet Generalized Autoregressive Pretraining for Language Understanding.pdf

AI Papers Available on my Google Drive