Machine translation, also known as automatic translation, is the process of using a computer to convert text in one natural language (the source language) into another natural language (the target language). It is a branch of computational linguistics, one of the ultimate goals of artificial intelligence, and an area of significant scientific research value.
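
As a concrete illustration of this task (an addition for this write-up, not part of the original collection), the sketch below translates a Chinese sentence into English with an off-the-shelf model. It assumes the Hugging Face transformers and sentencepiece packages and the publicly released Helsinki-NLP/opus-mt-zh-en checkpoint; none of these are prescribed by the resources listed here.

```python
# Illustrative sketch only: machine translation with a pretrained Chinese-to-English model.
# Assumes `pip install transformers sentencepiece` and network access to fetch the checkpoint.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-zh-en"   # assumed public checkpoint
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["机器翻译是计算语言学的一个分支。"], return_tensors="pt", padding=True)
outputs = model.generate(**batch)           # beam-search decoding of the target sentence
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```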

Knowledge Collection

Machine Translation (機器翻譯): Zhuanzhi (專知) Curated Collection

Getting Started

Surveys

Advanced Papers

1997

  1. Neco, R. P., & Forcada, M. L. (1997, June). Asynchronous translations with recurrent neural nets. In Proceedings of the International Conference on Neural Networks (ICNN'97) (Vol. 4, pp. 2535-2540). IEEE.
    [http://ieeexplore.ieee.org/document/614693/]

2003

  1. Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of machine learning research, 3(Feb), 1137-1155.
    [http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf]

2010

  1. Sudoh, K., Duh, K., Tsukada, H., Hirao, T., & Nagata, M. (2010, July). Divide and translate: improving long distance reordering in statistical machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR (pp. 418-427). Association for Computational Linguistics.
    [https://dl.acm.org/citation.cfm?id=1868912]

2013

  1. Kalchbrenner, N., & Blunsom, P. (2013, October). Recurrent Continuous Translation Models. In EMNLP (Vol. 3, No. 39, p. 413).
    [https://www.researchgate.net/publication/289758666_Recurrent_continuous_translation_models]
  2. Pascanu, R., Mikolov, T., & Bengio, Y. (2013, February). On the difficulty of training recurrent neural networks. In International Conference on Machine Learning (pp. 1310-1318).
    [http://arxiv.org/abs/1211.5063]

2014

  1. Mnih, V., Heess, N., & Graves, A. (2014). Recurrent models of visual attention. In Advances in neural information processing systems (pp. 2204-2212)
    [http://arxiv.org/abs/1406.6247]
  2. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112).
    [https://arxiv.org/abs/1409.3215]
  3. Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
    [http://arxiv.org/abs/1406.1078]
  4. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
    [https://arxiv.org/abs/1409.0473]
  5. Jean, S., Cho, K., Memisevic, R., & Bengio, Y. (2014). On using very large target vocabulary for neural machine translation. arXiv preprint arXiv:1412.2007.
    [http://arxiv.org/abs/1412.2007]
  6. Luong, M. T., Sutskever, I., Le, Q. V., Vinyals, O., & Zaremba, W. (2014). Addressing the rare word problem in neural machine translation. arXiv preprint arXiv:1410.8206.
    [http://arxiv.org/abs/1410.8206]

2015

  1. Sennrich, R., Haddow, B., & Birch, A. (2015). Improving neural machine translation models with monolingual data. arXiv preprint arXiv:1511.06709.
    [http://arxiv.org/abs/1511.06709]
  2. Dong, D., Wu, H., He, W., Yu, D., & Wang, H. (2015). Multi-Task Learning for Multiple Language Translation. In ACL (1) (pp. 1723-1732).
    [http://www.anthology.aclweb.org/P/P15/P15-1166.pdf]
  3. Shen, S., Cheng, Y., He, Z., He, W., Wu, H., Sun, M., & Liu, Y. (2015). Minimum risk training for neural machine translation. arXiv preprint arXiv:1512.02433.
    [https://arxiv.org/abs/1512.02433]
  4. Bojar, O., Chatterjee, R., Federmann, C., et al. (2015). Findings of the 2015 Workshop on Statistical Machine Translation. In Proceedings of the Tenth Workshop on Statistical Machine Translation.
    [https://www-test.pure.ed.ac.uk/portal/files/23139669/W15_3001.pdf]

2016

  1. Facebook:Convolutional Sequence to Sequence Learning Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin
    [https://arxiv.org/abs/1705.03122]
  2. Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., … & Klingner, J. (2016). Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
    [https://arxiv.org/abs/1609.08144v1]
  3. Gehring, J., Auli, M., Grangier, D., & Dauphin, Y. N. (2016). A convolutional encoder model for neural machine translation. arXiv preprint arXiv:1611.02344.
    [https://arxiv.org/abs/1611.02344]
  4. Cheng, Y., Xu, W., He, Z., He, W., Wu, H., Sun, M., & Liu, Y. (2016). Semi-supervised learning for neural machine translation. arXiv preprint arXiv:1606.04596.
    [http://arxiv.org/abs/1606.04596]
  5. Wang, M., Lu, Z., Li, H., & Liu, Q. (2016). Memory-enhanced decoder for neural machine translation. arXiv preprint arXiv:1606.02003.
    [https://arxiv.org/abs/1606.02003]
  6. Sennrich, R., & Haddow, B. (2016). Linguistic input features improve neural machine translation. arXiv preprint arXiv:1606.02892.
    [http://arxiv.org/abs/1606.02892]
  7. Tu, Z., Lu, Z., Liu, Y., Liu, X., & Li, H. (2016). Modeling coverage for neural machine translation. arXiv preprint arXiv:1601.04811.
    [http://arxiv.org/abs/1601.04811]
  8. Cohn, T., Hoang, C. D. V., Vymolova, E., Yao, K., Dyer, C., & Haffari, G. (2016). Incorporating structural alignment biases into an attentional neural translation model. arXiv preprint arXiv:1601.01085.
    [http://www.m-mitchell.com/NAACL-2016/NAACL-HLT2016/pdf/N16-1102.pdf]
  9. Hitschler, J., Schamoni, S., & Riezler, S. (2016). Multimodal pivots for image caption translation. arXiv preprint arXiv:1601.03916.
    [https://arxiv.org/abs/1601.03916]
  10. Junczys-Dowmunt, M., Dwojak, T., & Hoang, H. (2016). Is neural machine translation ready for deployment? A case study on 30 translation directions. arXiv preprint arXiv:1610.01108.
    [https://arxiv.org/abs/1610.01108]
  11. Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., … & Hughes, M. (2016). Google's multilingual neural machine translation system: enabling zero-shot translation. arXiv preprint arXiv:1611.04558.
    [https://arxiv.org/abs/1611.04558]
  12. Bartolome, Diego, and Gema Ramirez. "Beyond the Hype of Neural Machine Translation," MIT Technology Review (May 23, 2016), bit.ly/2aG4bvR.
    [https://www.slideshare.net/TAUS/beyond-the-hype-of-neural-machine-translation-diego-bartolome-tauyou-and-gema-ramirez-prompsit-language-engineering]
  13. Crego, J., Kim, J., Klein, G., Rebollo, A., Yang, K., Senellart, J., … & Enoue, S. (2016). SYSTRAN's Pure Neural Machine Translation Systems. arXiv preprint arXiv:1610.05540.
    [https://arxiv.org/abs/1610.05540]

2017

  1. Google:Attention Is All You Need Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
    [http://arxiv.org/abs/1706.03762]
  2. Microsoft: Neural Phrase-based Machine Translation Po-Sen Huang, Chong Wang, Dengyong Zhou, Li Deng
    [http://arxiv.org/abs/1706.05565]
  3. A Neural Network for Machine Translation, at Production Scale. (2017). Research Blog. Retrieved 26 July 2017, from [https://research.googleblog.com/2016/09/a-neural-network-for-machine.html]
    [http://www.googblogs.com/a-neural-network-for-machine-translation-at-production-scale/]
  4. Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017). Convolutional Sequence to Sequence Learning. arXiv preprint arXiv:1705.03122.
    [https://arxiv.org/abs/1705.03122]
  5. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention Is All You Need. arXiv preprint arXiv:1706.03762.
    [https://arxiv.org/abs/1706.03762]
  6. Train Neural Machine Translation Models with Sockeye | Amazon Web Services. (2017). Amazon Web Services. Retrieved 26 July 2017, from
    [https://aws.amazon.com/blogs/ai/train-neural-machine-translation-models-with-sockeye/]
  7. Dandekar, N. (2017). How does an attention mechanism work in deep learning for natural language processing?. Quora. Retrieved 26 July 2017, from
    [https://www.quora.com/How-does-an-attention-mechanism-work-in-deep-learning-for-natural-language-processing]
  8. Microsoft Translator launching Neural Network based translations for all its speech languages. (2017). Translator. Retrieved 27 July 2017, from
    [https://blogs.msdn.microsoft.com/translation/2016/11/15/microsoft-translator-launching-neural-network-based-translations-for-all-its-speech-languages/]
  9. ACL 2017. (2017). Accepted Papers, Demonstrations and TACL Articles for ACL 2017. [online] Available at:
    [https://chairs-blog.acl2017.org/2017/04/05/accepted-papers-and-demonstrations/] [Accessed 7 Aug. 2017].

2018

  1. Miguel Domingo, Álvaro Peris and Francisco Casacuberta. 2018. Segment-based interactive-predictive machine translation. Machine Translation.[https://www.researchgate.net/publication/322275484_Segment-based_interactive-predictive_machine_translation] [Citation: 2]

  2. Xin Wang, Wenhu Chen, Yuan-Fang Wang, and William Yang Wang. 2018. No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling. In Proceedings of ACL 2018.[http://aclweb.org/anthology/P18-1083] [Citation: 10]

  3. Arun Tejasvi Chaganty, Stephen Mussman, and Percy Liang. 2018. The price of debiasing automatic metrics in natural language evaluation.[https://arxiv.org/pdf/1807.02202] [In Proceedings of ACL 2018.]

  4. Xin Wang, Wenhu Chen, Yuan-Fang Wang, and William Yang Wang. 2018.No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling. In Proceedings of ACL 2018. (Citation: 10)

  5. Arun Tejasvi Chaganty, Stephen Mussman, and Percy Liang. 2018.The price of debiasing automatic metrics in natural language evaluation. In Proceedings of ACL 2018.

  6. Lukasz Kaiser, Aidan N. Gomez, and Francois Chollet. 2018.Depthwise Separable Convolutions for Neural Machine Translation. In Proceedings of ICLR 2018. (Citation: 27)

  7. Yanyao Shen, Xu Tan, Di He, Tao Qin, and Tie-Yan Liu. 2018.Dense Information Flow for Neural Machine Translation. In Proceedings of NAACL 2018. (Citation: 3)

  8. Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, and Ming Zhou. 2018.Generative Bridging Network for Neural Sequence Prediction. In Proceedings of NAACL 2018. (Citation: 3)

  9. Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Mike Schuster, Noam Shazeer, Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Zhifeng Chen, Yonghui Wu, and Macduff Hughes. 2018.The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation. In Proceedings of ACL 2018. (Citation: 22)

  10. Weiyue Wang, Derui Zhu, Tamer Alkhouli, Zixuan Gan, and Hermann Ney. 2018.Neural Hidden Markov Model for Machine Translation. In Proceedings of ACL 2018. (Citation: 3)

  11. Jingjing Gong, Xipeng Qiu, Shaojing Wang, and Xuanjing Huang. 2018.Information Aggregation via Dynamic Routing for Sequence Encoding. In COLING 2018.

  12. Qiang Wang, Fuxue Li, Tong Xiao, Yanyang Li, Yinqiao Li, and Jingbo Zhu. 2018.Multi-layer Representation Fusion for Neural Machine Translation. In Proceedings of COLING 2018 .

  13. Yachao Li, Junhui Li, and Min Zhang. 2018.Adaptive Weighting for Neural Machine Translation. In Proceedings of COLING 2018 .

  14. Kaitao Song, Xu Tan, Di He, Jianfeng Lu, Tao Qin, and Tie-Yan Liu. 2018.Double Path Networks for Sequence to Sequence Learning. In Proceedings of COLING 2018 .

  15. Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Shuming Shi, and Tong Zhang. 2018.Exploiting Deep Representations for Neural Machine Translation. In Proceedings of EMNLP 2018 . (Citation: 1)

  16. Biao Zhang, Deyi Xiong, Jinsong Su, Qian Lin, and Huiji Zhang. 2018.Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks. In Proceedings of EMNLP 2018 .

  17. Gongbo Tang, Mathias Müller, Annette Rios, and Rico Sennrich. 2018.Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures. In Proceedings of EMNLP 2018 . (Citation: 6)

  18. Ke Tran, Arianna Bisazza, and Christof Monz. 2018.The Importance of Being Recurrent for Modeling Hierarchical Structure. In Proceedings of EMNLP 2018 . (Citation: 6)

  19. Parnia Bahar, Christopher Brix, and Hermann Ney. 2018.Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation. In Proceedings of EMNLP 2018 . (Citation: 1)

  20. Tianyu He, Xu Tan, Yingce Xia, Di He, Tao Qin, Zhibo Chen, and Tie-Yan Liu. 2018.Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation. In Proceedings of NeurIPS 2018 . (Citation: 2)

  21. Harshil Shah and David Barber. 2018.Generative Neural Machine Translation. In Proceedings of NeurIPS 2018 .

  22. Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dongdong Zhang, Zhirui Zhang, and Ming Zhou. 2018.Achieving Human Parity on Automatic Chinese to English News Translation. Technical report. Microsoft AI & Research. (Citation: 41)

  23. Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, and Chengqi Zhang. 2018.DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding. In Proceedings of AAAI 2018 . (Citation: 60)

  24. Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, and Chengqi Zhang. 2018.Bi-directional Block Self-attention for Fast and Memory-efficient Sequence Modeling. In Proceedings of ICLR 2018 . (Citation: 13)

  25. Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang. 2018.Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling. In Proceedings of IJCAI 2018 . (Citation: 18)

  26. Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. 2018.Self-Attention with Relative Position Representations. In Proceedings of NAACL 2018 . (Citation: 24)

  27. Lesly Miculicich Werlen, Nikolaos Pappas, Dhananjay Ram, and Andrei Popescu-Belis. 2018.Self-Attentive Residual Decoder for Neural Machine Translation. In Proceedings of NAACL 2018 . (Citation: 3)

  28. Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, and Max Meng. 2018.Target Foresight Based Attention for Neural Machine Translation. In Proceedings of NAACL 2018 .

  29. Biao Zhang, Deyi Xiong, and Jinsong Su. 2018.Accelerating Neural Transformer via an Average Attention Network. In Proceedings of ACL 2018 . (Citation: 5)

  30. Tobias Domhan. 2018.How Much Attention Do You Need? A Granular Analysis of Neural Machine Translation Architectures. In Proceedings of ACL 2018 . (Citation: 3)

  31. Shaohui Kuang, Junhui Li, António Branco, Weihua Luo, and Deyi Xiong. 2018.Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings. In Proceedings of ACL 2018 . (Citation: 1)

  32. Chaitanya Malaviya, Pedro Ferreira, and André F. T. Martins. 2018.Sparse and Constrained Attention for Neural Machine Translation. In Proceedings of ACL 2018 . (Citation: 4)

  33. Jian Li, Zhaopeng Tu, Baosong Yang, Michael R. Lyu, and Tong Zhang. 2018.Multi-Head Attention with Disagreement Regularization. In Proceedings of EMNLP 2018 . (Citation: 1)

  34. Wei Wu, Houfeng Wang, Tianyu Liu and Shuming Ma. 2018.Phrase-level Self-Attention Networks for Universal Sentence Encoding. In Proceedings of EMNLP 2018 .

  35. Baosong Yang, Zhaopeng Tu, Derek F. Wong, Fandong Meng, Lidia S. Chao, and Tong Zhang. 2018.Modeling Localness for Self-Attention Networks. In Proceedings of EMNLP 2018 . (Citation: 2)

  36. Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, and Qi Su. 2018.Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation. In Proceedings of EMNLP 2018 .

  37. Shiv Shankar, Siddhant Garg, and Sunita Sarawagi. 2018.Surprisingly Easy Hard-Attention for Sequence to Sequence Learning. In Proceedings of EMNLP 2018 .

  38. Ankur Bapna, Mia Chen, Orhan Firat, Yuan Cao, and Yonghui Wu. 2018.Training Deeper Neural Machine Translation Models with Transparent Attention. In Proceedings of EMNLP 2018 .

  39. Hareesh Bahuleyan, Lili Mou, Olga Vechtomova, and Pascal Poupart. 2018.Variational Attention for Sequence-to-Sequence Models. In Proceedings of COLING 2018 . (Citation: 14)

  40. Maha Elbayad, Laurent Besacier, and Jakob Verbeek. 2018.Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction. In Proceedings of CoNLL 2018 . (Citation: 4)

  41. Yuntian Deng, Yoon Kim, Justin Chiu, Demi Guo, and Alexander M. Rush. 2018.Latent Alignment and Variational Attention. In Proceedings of NeurIPS 2018 .

  42. Peyman Passban, Qun Liu, and Andy Way. 2018.Improving Character-Based Decoding Using Target-Side Morphological Information for Neural Machine Translation. In Proceedings of NAACL 2018 . (Citation: 5)

  43. Huadong Chen, Shujian Huang, David Chiang, Xinyu Dai, and Jiajun Chen. 2018.Combining Character and Word Information in Neural Machine Translation Using a Multi-Level Attention. In Proceedings of NAACL 2018 .

  44. Frederick Liu, Han Lu, and Graham Neubig. 2018.Handling Homographs in Neural Machine Translation. In Proceedings of NAACL 2018 . (Citation: 8)

  45. Taku Kudo. 2018.Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates. In Proceedings of ACL 2018 . (Citation: 17)

  46. Makoto Morishita, Jun Suzuki, and Masaaki Nagata. 2018.Improving Neural Machine Translation by Incorporating Hierarchical Subword Features. In Proceedings of COLING 2018 .

  47. Yang Zhao, Jiajun Zhang, Zhongjun He, Chengqing Zong, and Hua Wu. 2018.Addressing Troublesome Words in Neural Machine Translation. In Proceedings of EMNLP 2018 .

  48. Colin Cherry, George Foster, Ankur Bapna, Orhan Firat, and Wolfgang Macherey. 2018.Revisiting Character-Based Neural Machine Translation with Capacity and Compression. In Proceedings of EMNLP 2018 . (Citation: 1)

  49. Rebecca Knowles and Philipp Koehn. 2018.Context and Copying in Neural Machine Translation. In Proceedings of EMNLP 2018 .

  50. Sergey Edunov, Myle Ott, Michael Auli, David Grangier, and Marc’Aurelio Ranzato. 2018.Classical Structured Prediction Losses for Sequence to Sequence Learning. In Proceedings of NAACL 2018 . (Citation: 20)

  51. Zihang Dai, Qizhe Xie, and Eduard Hovy. 2018.From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction. In Proceedings of ACL 2018 . (Citation: 1)

  52. Zhen Yang, Wei Chen, Feng Wang, and Bo Xu. 2018.Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets. In Proceedings of NAACL 2018 . (Citation: 43)

  53. Kevin Clark, Minh-Thang Luong, Christopher D. Manning, and Quoc Le. 2018.Semi-Supervised Sequence Modeling with Cross-View Training. In Proceedings of EMNLP 2018 .

  54. Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, and Tie-Yan Liu. 2018.A Study of Reinforcement Learning for Neural Machine Translation. In Proceedings of EMNLP 2018 . (Citation: 2)

  55. Jason Lee, Elman Mansimov, and Kyunghyun Cho. 2018.Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement. In Proceedings of EMNLP 2018 .

  56. Semih Yavuz, Chung-Cheng Chiu, Patrick Nguyen, and Yonghui Wu. 2018.CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization. In Proceedings of EMNLP 2018 .

  57. Lijun Wu, Fei Tian, Yingce Xia, Yang Fan, Tao Qin, Jianhuang Lai, and Tie-Yan Liu. 2018.Learning to Teach with Dynamic Loss Functions. In Proceedings of NeurIPS 2018 .

  58. Jiatao Gu, James Bradbury, Caiming Xiong, Victor O.K. Li, and Richard Socher. 2018.Non-Autoregressive Neural Machine Translation. In Proceedings of ICLR 2018 . (Citation: 23)

  59. Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, and Noam Shazeer. 2018.Fast Decoding in Sequence Models Using Discrete Latent Variables. In Proceedings of ICML 2018 . (Citation: 3)

  60. Xiangwen Zhang, Jinsong Su, Yue Qin, Yang Liu, Rongrong Ji, and Hongji Wang. 2018.Asynchronous Bidirectional Decoding for Neural Machine Translation. In Proceedings of AAAI 2018 . (Citation: 10)

  61. Jiatao Gu, Daniel Jiwoong Im, and Victor O.K. Li. 2018.Neural machine translation with gumbel-greedy decoding. In Proceedings of AAAI 2018 . (Citation: 5)

  62. Philip Schulz, Wilker Aziz, and Trevor Cohn. 2018.A Stochastic Decoder for Neural Machine Translation. In Proceedings of ACL 2018 . (Citation: 3)

  63. Raphael Shu and Hideki Nakayama. 2018.Improving Beam Search by Removing Monotonic Constraint for Neural Machine Translation. In Proceedings of ACL 2018 .

  64. Junyang Lin, Xu Sun, Xuancheng Ren, Shuming Ma, Jinsong Su, and Qi Su. 2018.Deconvolution-Based Global Decoding for Neural Machine Translation. In Proceedings of COLING 2018 . (Citation: 2)

  65. Chunqi Wang, Ji Zhang, and Haiqing Chen. 2018.Semi-Autoregressive Neural Machine Translation. In Proceedings of EMNLP 2018 .

  66. Xinwei Geng, Xiaocheng Feng, Bing Qin, and Ting Liu. 2018.Adaptive Multi-pass Decoder for Neural Machine Translation. In Proceedings of EMNLP 2018 .

  67. Wen Zhang, Liang Huang, Yang Feng, Lei Shen, and Qun Liu. 2018.Speeding Up Neural Machine Translation Decoding by Cube Pruning. In Proceedings of EMNLP 2018 .

  68. Xinyi Wang, Hieu Pham, Pengcheng Yin, and Graham Neubig. 2018.A Tree-based Decoder for Neural Machine Translation. In Proceedings of EMNLP 2018 . (Citation: 1)

  69. Chenze Shao, Xilin Chen, and Yang Feng. 2018.Greedy Search with Probabilistic N-gram Matching for Neural Machine Translation. In Proceedings of EMNLP 2018 .

  70. Zhisong Zhang, Rui Wang, Masao Utiyama, Eiichiro Sumita, and Hai Zhao. 2018.Exploring Recombination for Efficient Decoding of Neural Machine Translation. In Proceedings of EMNLP 2018 .

  71. Jetic Gū, Hassan S. Shavarani, and Anoop Sarkar. 2018.Top-down Tree Structured Decoding with Syntactic Connections for Neural Machine Translation and Parsing. In Proceedings of EMNLP 2018 .

  72. Yilin Yang, Liang Huang, and Mingbo Ma. 2018.Breaking the Beam Search Curse: A Study of (Re-)Scoring Methods and Stopping Criteria for Neural Machine Translation. In Proceedings of EMNLP 2018 . (Citation: 3)

  73. Yun Chen, Victor O.K. Li, Kyunghyun Cho, and Samuel R. Bowman. 2018.A Stable and Effective Learning Strategy for Trainable Greedy Decoding. In Proceedings of EMNLP 2018 .

2019

  1. Graham Neubig, Zi-Yi Dou, Junjie Hu, Paul Michel, Danish Pruthi, and Xinyi Wang. 2019.compare-mt: A Tool for Holistic Comparison of Language Generation Systems. In Proceedings of NAACL 2019 .
  2. Robert Schwarzenberg, David Harbecke, Vivien Macketanz, Eleftherios Avramidis, and Sebastian Möller. 2019.Train, Sort, Explain: Learning to Diagnose Translation Models. In Proceedings of NAACL 2019 .
  3. Nitika Mathur, Timothy Baldwin, and Trevor Cohn. 2019.Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation. In Proceedings of ACL 2019 .
  4. Prathyusha Jwalapuram, Shafiq Joty, Irina Temnikova, and Preslav Nakov. 2019.Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite. In Proceedings of ACL 2019 .
  5. Yikang Shen, Shawn Tan, Alessandro Sordoni, and Aaron Courville. 2019.Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks. In Proceedings of ICLR 2019 .
  6. Felix Wu, Angela Fan, Alexei Baevski, Yann Dauphin, and Michael Auli. 2019.Pay Less Attention with Lightweight and Dynamic Convolutions. In Proceedings of ICLR 2019 . (Citation: 1)
  7. Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Lukasz Kaiser. 2019.Universal Transformers. In Proceedings of ICLR 2019 . (Citation: 12)
  8. Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Longyue Wang, Shuming Shi, and Tong Zhang. 2019.Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement. In Proceedings of AAAI 2019 .
  9. Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, and Ruslan Salakhutdinov. 2019.Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. In Proceedings of ACL 2019 . (Citation: 8)
  10. Wenpeng Yin and Hinrich Schütze. 2019.Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms. Transactions of the Association for Computational Linguistics .
  11. Shiv Shankar and Sunita Sarawagi. 2019.Posterior Attention Models for Sequence to Sequence Learning. In Proceedings of ICLR 2019 .
  12. Baosong Yang, Jian Li, Derek Wong, Lidia S. Chao, Xing Wang, and Zhaopeng Tu. 2019.Context-Aware Self-Attention Networks. In Proceedings of AAAI 2019 .
  13. Reza Ghaeini, Xiaoli Z. Fern, Hamed Shahbazi, and Prasad Tadepalli. 2019.Saliency Learning: Teaching the Model Where to Pay Attention. In Proceedings of NAACL 2019 .
  14. Sameen Maruf, André F. T. Martins, and Gholamreza Haffari. 2019.Selective Attention for Context-aware Neural Machine Translation. In Proceedings of NAACL 2019 .
  15. Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, and Armand Joulin. 2019.Adaptive Attention Span in Transformers. In Proceedings of ACL 2019 .
  16. Yiren Wang, Yingce Xia, Tianyu He, Fei Tian, Tao Qin, ChengXiang Zhai, and Tie-Yan Liu. 2019.Multi-Agent Dual Learning. In Proceedings of ICLR 2019 .
  17. Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, and Lawrence Carin. 2019.Improving Sequence-to-Sequence Learning via Optimal Transport. In Proceedings of ICLR 2019 .
  18. Sachin Kumar and Yulia Tsvetkov. 2019.Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs. In Proceedings of ICLR 2019 .
      Xing Niu, Weijia Xu, and Marine Carpuat. 2019.Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation. In Proceedings of NAACL 2019 .
  19. Weijia Xu, Xing Niu, and Marine Carpuat. 2019.Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation. In Proceedings of NAACL 2019 .
  20. Junliang Guo, Xu Tan, Di He, Tao Qin, Linli Xu, and Tie-Yan Liu. 2019.Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input. In Proceedings of AAAI 2019 . (Citation: 2)
  21. Yiren Wang, Fei Tian, Di He, Tao Qin, ChengXiang Zhai, and Tie-Yan Liu. 2019.Non-Autoregressive Machine Translation with Auxiliary Regularization. In Proceedings of AAAI 2019 .
  22. Wouter Kool, Herke van Hoof, and Max Welling. 2019.Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. In Proceedings of ICML 2019 .
  23. Ashwin Kalyan, Peter Anderson, Stefan Lee, and Dhruv Batra. 2019.Trainable Decoding of Sets of Sequences for Neural Sequence Models. In Proceedings of ICML 2019 .
  24. Eldan Cohen and Christopher Beck. 2019.Empirical Analysis of Beam Search Performance Degradation in Neural Sequence Models. In Proceedings of ICML 2019 .
  25. Kartik Goyal, Chris Dyer, and Taylor Berg-Kirkpatrick. 2019.An Empirical Investigation of Global and Local Normalization for Recurrent Neural Sequence Models Using a Continuous Relaxation to Beam Search. In Proceedings of NAACL 2019 .
  26. Rico Sennrich and Biao Zhang. 2019.Revisiting Low-Resource Neural Machine Translation: A Case Study. In Proceedings of ACL 2019 .
  27. Shuo Wang, Yang Liu, Chao Wang, Huanbo Luan, and Maosong Sun. 2019.Improving Back-Translation with Uncertainty-based Confidence Estimation. In Proceedings of EMNLP 2019 .
  28. Jiawei Wu, Xin Wang, and William Yang Wang. 2019.Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation. In Proceedings of NAACL 2019 .
  29. Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, and Jonathan May. 2019.Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation. In Proceedings of ACL 2019 .
  30. Jiaming Luo, Yuan Cao, and Regina Barzilay. 2019.Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B. In Proceedings of ACL 2019 .
  31. Yichong Leng, Xu Tan, Tao Qin, Xiang-Yang Li, and Tie-Yan Liu. 2019.Unsupervised Pivot Translation for Distant Languages. In Proceedings of ACL 2019 .
  32. Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2019.An Effective Approach to Unsupervised Machine Translation. In Proceedings of ACL 2019 .
  33. Mengzhou Xia, Xiang Kong, Antonios Anastasopoulos, and Graham Neubig. 2019.Generalized Data Augmentation for Low-Resource Translation. In Proceedings of ACL 2019 .
  34. Jinhua Zhu, Fei Gao, Lijun Wu, Yingce Xia, Tao Qin, Wengang Zhou, Xueqi Cheng, and Tie-Yan Liu. 2019.Soft Contextual Data Augmentation for Neural Machine Translation. In Proceedings of ACL 2019 .
  35. Chunting Zhou, Xuezhe Ma, Junjie Hu, and Graham Neubig. 2019.Handling Syntactic Divergence in Low-resource Machine Translation. In Proceedings of EMNLP 2019 .
  36. Yuanpeng Li, Liang Zhao, Jianyu Wang, and Joel Hestness. 2019.Compositional Generalization for Primitive Substitutions. In Proceedings of EMNLP 2019 .
  37. Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, and Hermann Ney. 2019.Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages. In Proceedings of EMNLP 2019 .
  38. Yunsu Kim, Yingbo Gao, and Hermann Ney. 2019.Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies. In Proceedings of ACL 2019 .
  39. Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, and Tie-Yan Liu. 2019.Multilingual Neural Machine Translation with Knowledge Distillation. In Proceedings of ICLR 2019 .
  40. Xinyi Wang, Hieu Pham, Philip Arthur, and Graham Neubig. 2019.Multilingual Neural Machine Translation With Soft Decoupled Encoding. In Proceedings of ICLR 2019 .
  41. Maruan Al-Shedivat and Ankur P. Parikh. 2019.Consistency by Agreement in Zero-shot Neural Machine Translation. In Proceedings of NAACL 2019 .
  42. Roee Aharoni, Melvin Johnson, and Orhan Firat. 2019.Massively Multilingual Neural Machine Translation. In Proceedings of NAACL 2019 .
  43. Yunsu Kim, Yingbo Gao, and Hermann Ney. 2019.Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies. In Proceedings of ACL 2019 .
  44. Inigo Jauregi Unanue, Ehsan Zare Borzeshi, Nazanin Esmaili, and Massimo Piccardi. 2019.ReWE: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems. In Proceedings of NAACL 2019 .
  45. Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, and Min Zhang. 2019.Code-Switching for Enhancing NMT with Pre-Specified Translation. In Proceedings of NAACL 2019 .
  46. Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, and Jingbo Zhu. 2019.Shared-Private Bilingual Word Embeddings for Neural Machine Translation. In Proceedings of ACL 2019 .
  47. Georgiana Dinu, Prashant Mathur, Marcello Federico, and Yaser Al-Onaizan. 2019.Training Neural Machine Translation to Apply Terminology Constraints. In Proceedings of ACL 2019 .
  48. Rudra Murthy V, Anoop Kunchukuttan, and Pushpak Bhattacharyya. 2019.Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages. In Proceedings of NAACL 2019 .
  49. Meishan Zhang, Zhenghua Li, Guohong Fu, and Min Zhang. 2019.Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations. In Proceedings of NAACL 2019 .
  50. Linfeng Song, Daniel Gildea, Yue Zhang, Zhiguo Wang, and Jinsong Su. 2019.Semantic Neural Machine Translation Using AMR. Transactions of the Association for Computational Linguistics .
  51. Nader Akoury, Kalpesh Krishna, and Mohit Iyyer. 2019.Syntactically Supervised Transformers for Faster Neural Machine Translation. In Proceedings of ACL 2019 .
  52. Antonios Anastasopoulos, Alison Lui, Toan Nguyen, and David Chiang. 2019.Neural Machine Translation of Text from Non-Native Speakers. In Proceedings of NAACL 2019 .
  53. Paul Michel, Xian Li, Graham Neubig, and Juan Miguel Pino. 2019.On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models. In Proceedings of NAACL 2019 .
  54. Vaibhav Vaibhav, Sumeet Singh, Craig Stewart, and Graham Neubig. 2019.Improving Robustness of Machine Translation with Synthetic Noise. In Proceedings of NAACL 2019 .
  55. Yong Cheng, Lu Jiang, and Wolfgang Macherey. 2019.Robust Neural Machine Translation with Doubly Adversarial Inputs. In Proceedings of ACL 2019 .
  56. Hairong Liu, Mingbo Ma, Liang Huang, Hao Xiong, and Zhongjun He. 2019.Robust Neural Machine Translation with Joint Textual and Phonetic Embedding. In Proceedings of ACL 2019 .
  57. Yonatan Belinkov, and James Glass. 2019.Analysis Methods in Neural Language Processing: A Survey. Transactions of the Association for Computational Linguistics .
  58. Sofia Serrano and Noah A. Smith. 2019.Is Attention Interpretable?. In Proceedings of ACL 2019 .
  59. Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, and Ivan Titov. 2019.Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned. In Proceedings of ACL 2019 .
  60. Joris Baan, Jana Leible, Mitja Nikolaus, David Rau, Dennis Ulmer, Tim Baumgärtner, Dieuwke Hupkes, and Elia Bruni. 2019.On the Realization of Compositionality in Neural Networks. In Proceedings of ACL 2019 .
  61. Jesse Vig and Yonatan Belinkov. 2019.Analyzing the Structure of Attention in a Transformer Language Model. In Proceedings of ACL 2019 .
  62. Ashwin Kalyan, Peter Anderson, Stefan Lee, and Dhruv Batra. 2019.Trainable Decoding of Sets of Sequences for Neural Sequence Models. In Proceedings of ICML 2019 .
  63. Tianxiao Shen, Myle Ott, Michael Auli, and Marc’Aurelio Ranzato. 2019.Mixture Models for Diverse Machine Translation: Tricks of the Trade. In Proceedings of ICML 2019 .
  64. Wouter Kool, Herke van Hoof, and Max Welling. 2019.Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. In Proceedings of ICML 2019 .
  65. Won Ik Cho, Ji Won Kim, Seok Min Kim, and Nam Soo Kim. 2019.On Measuring Gender Bias in Translation of Gender-neutral Pronouns. In Proceedings of ACL 2019 .
  66. Gabriel Stanovsky, Noah A. Smith, and Luke Zettlemoyer. 2019.Evaluating Gender Bias in Machine Translation. In Proceedings of ACL 2019 .
  67. Guillaume Lample and Alexis Conneau. 2019.Cross-lingual Language Model Pretraining. arXiv:1901.07291 . (Citation: 3)
  68. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018.BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL 2019 . (Citation: 292)
  69. Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019.Language Models are Unsupervised Multitask Learners. Technical Report, OpenAI. (Citation: 9)
      Sergey Edunov, Alexei Baevski, and Michael Auli. 2019.Pre-trained Language Model Representations for Language Generation. In Proceedings of NAACL 2019 . (Citation: 1)
  70. Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2019.MASS: Masked Sequence to Sequence Pre-training for Language Generation. In Proceedings of ICML 2019 .
  71. Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V. Le. 2019.XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv:1906.08237 .
  72. Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit. 2019.Insertion Transformer: Flexible Sequence Generation via Insertion Operations. In Proceedings of ICML 2019.
  73. Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, and Sharon Goldwater. 2019.Pre-training on high-resource speech recognition improves low-resource speech-to-text translation. In Proceedings of NAACL 2019 .
  74. Nikolai Vogler, Craig Stewart, and Graham Neubig. 2019.Lost in Interpretation: Predicting Untranslated Terminology in Simultaneous Interpretation. In Proceedings of NAACL 2019 .
  75. Elizabeth Salesky, Matthias Sperber, and Alex Waibel. 2019.Fluent Translations from Disfluent Speech in End-to-End Speech Translation. In Proceedings of NAACL 2019 .
  76. Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, and Colin Raffel. 2019.Monotonic Infinite Lookback Attention for Simultaneous Machine Translation. In Proceedings of ACL 2019 .

Tutorial

  1. ACL 2016 Tutorial -- Neural Machine Translation: the tutorial given by Lmthang (Minh-Thang Luong) at ACL 2016 [http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf]
  2. Frontiers of Neural Machine Translation: a talk by Yang Liu (Tsinghua University) at the 12th China Workshop on Machine Translation (CWMT 2016, held in Urumqi in August 2016) [http://nlp.csai.tsinghua.edu.cn/~ly/talks/cwmt2016_ly_v3_160826.pptx]
  3. CCL 2016 | T1B: Deep Learning and Machine Translation, a tutorial at the 15th China National Conference on Computational Linguistics (CCL 2016) [http://www.cips-cl.org/static/CCL2016/tutorialsT1B.html]
  4. Neural Machine Translation [http://statmt.org/mtma16/uploads/mtma16-neural.pdf]
  5. The ACL 2016 tutorial site by Thang Luong, Kyunghyun Cho, and Christopher Manning [https://sites.google.com/site/acl16nmt/]
  6. Kyunghyun Cho's talk, New Territory of Machine Translation, mainly covering the NMT problems Cho himself focuses on [https://drive.google.com/file/d/0B16RwCMQqrtdRVotWlQ3T2ZXTmM/view]

Video Tutorials

  1. cs224d neural machine translation [https://cs224d.stanford.edu/lectures/CS224d-Lecture15.pdf] [https://www.youtube.com/watch?v=IxQtK2SjWWM&index=11&list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6]
  2. Yang Liu (Tsinghua University): Machine Translation Based on Deep Learning
  3. A Practical Guide to Neural Machine Translation [https://www.youtube.com/watch?v=vxibD6VaOfI]

Code

  1. seq2seq: an implementation of Google's seq2seq model, built on TensorFlow. [https://github.com/tensorflow/tensorflow]
  2. nmt.matlab: open-sourced by Lmthang (Minh-Thang Luong), then a PhD student at Stanford; written in MATLAB. [https://github.com/lmthang/nmt.matlab]
  3. GroundHog: an implementation of attention-based neural machine translation from Bengio's group, built on Theano (a minimal sketch of the attention step appears after this list). [https://github.com/lisa-groundhog/GroundHog]
  4. NMT-Coverage: an implementation of coverage-based neural machine translation from Hang Li's team at Huawei Noah's Ark Lab, built on Theano. [https://github.com/tuzhaopeng/NMT-Coverage]
  5. OpenNMT: a production-grade neural machine translation toolkit open-sourced by the Harvard NLP group, built on Torch. [http://opennmt.net/]
  6. EUREKA-MangoNMT: developed in C++ by Jiajun Zhang of the Institute of Automation, Chinese Academy of Sciences. [https://github.com/jiajunzhangnlp/EUREKA-MangoNMT]
  7. dl4mt-tutorial: built on Theano. [https://github.com/nyu-dl/dl4mt-tutorial]
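
The attention-based toolkits above all build on the same core computation. The sketch below illustrates additive ("Bahdanau-style") attention in PyTorch; it is written for this write-up rather than taken from any of the repositories, and the dimensions and names are arbitrary.

```python
# Minimal additive-attention sketch: given the current decoder state and all encoder
# states, compute attention weights over the source and a context vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    def __init__(self, enc_dim: int, dec_dim: int, attn_dim: int):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_states):
        # dec_state: (batch, dec_dim); enc_states: (batch, src_len, enc_dim)
        scores = self.v(torch.tanh(self.W_enc(enc_states) +
                                   self.W_dec(dec_state).unsqueeze(1)))  # (batch, src_len, 1)
        weights = F.softmax(scores, dim=1)              # normalized attention over source positions
        context = (weights * enc_states).sum(dim=1)     # weighted sum -> (batch, enc_dim)
        return context, weights.squeeze(-1)
```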

Domain Experts

  1. Université de Montréal: Yoshua Bengio, Dzmitry Bahdanau
  2. New York University: Kyunghyun Cho
  3. Stanford University: Christopher Manning, Minh-Thang Luong (Lmthang)
  4. Google: Ilya Sutskever, Quoc V. Le
  5. Institute of Computing Technology, Chinese Academy of Sciences: Qun Liu
  6. Northeastern University (China): Jingbo Zhu
  7. Tsinghua University: Yang Liu
  8. Institute of Automation, Chinese Academy of Sciences: Chengqing Zong, Jiajun Zhang
  9. Soochow University: Deyi Xiong, Min Zhang
  10. Huawei Noah's Ark Lab: Hang Li, Zhaopeng Tu
  11. Baidu: Haifeng Wang, Hua Wu

This is a preliminary version with inevitable limitations; corrections and additions are welcome, and the list will be kept up to date. This article is original content from the Zhuanzhi content team and may not be reproduced without permission; for reprint requests, email fangquanyi@gmail.com or contact the Zhuanzhi assistant on WeChat (Rancho_Fang).


VIP Content

[Overview] ACL-IJCNLP 2021 is a CCF Class A conference and the most authoritative international venue for natural language processing (NLP) within artificial intelligence. ACL 2021 is scheduled to be held online from August 1 to August 6, 2021. Lei Li, formerly director of the ByteDance AI Lab, recently returned to academia as an assistant professor at UC Santa Barbara. Together with Mingxuan Wang, he gave a tutorial on machine translation in the era of pre-training, which is well worth attention.

Pre-training is the dominant paradigm in natural language processing (NLP) [28,8,20], computer vision (CV) [12,34], and automatic speech recognition (ASR) [3,6,24]. Typically, a model is first pre-trained on large amounts of unlabeled data to capture rich input representations, and is then applied to downstream tasks either by providing context-aware input representations or by initializing the parameters of the downstream model for fine-tuning. Recently, the paradigm of self-supervised pre-training followed by task-specific fine-tuning has finally made its way fully into neural machine translation (NMT) [37,35,5].
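
As a minimal sketch of this pre-train-then-fine-tune pattern (my illustration, not material from the tutorial), the snippet below initializes an NMT model from a publicly released multilingual denoising checkpoint and takes one fine-tuning step on a toy sentence pair. It assumes a recent Hugging Face transformers version (one that accepts text_target) with sentencepiece installed, and the facebook/mbart-large-cc25 checkpoint; the language pair, example, and hyperparameters are placeholders.

```python
# Sketch under stated assumptions: initialize from a pre-trained denoising model
# (mBART), then fine-tune its parameters on parallel data for translation.
import torch
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="en_XX", tgt_lang="ro_RO")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# One toy English-Romanian pair; a real setup iterates over a parallel corpus.
src = "UN Chief Says There Is No Military Solution in Syria"
tgt = "Şeful ONU declară că nu există o soluţie militară în Siria"
batch = tokenizer(src, text_target=tgt, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
loss = model(**batch).loss   # token-level cross-entropy against the reference translation
loss.backward()
optimizer.step()
```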

Despite these successes, introducing a general-purpose pre-trained model into NMT is not straightforward and does not necessarily yield promising results, especially in resource-rich settings. Unique challenges remain in several respects. First, the objectives of most pre-training methods differ from the downstream NMT task. For example, BERT [8], a popular pre-trained model, is designed for language understanding using only a Transformer encoder, whereas NMT models usually consist of an encoder and a decoder to perform cross-lingual generation. This gap makes it less feasible to apply such pre-training to NMT [30]. Moreover, machine translation is inherently a multilingual problem, yet general NLP pre-training methods such as BERT and GPT focus mainly on English corpora. Given the success of transfer learning in multilingual machine translation, multilingual pre-training for NMT [7] is very attractive. Finally, speech translation has attracted wide attention in recent years, while most pre-training methods focus on text representations. How to use pre-training to improve spoken-language translation has become a new challenge.

This tutorial provides comprehensive guidance on making full use of pre-training for neural machine translation. We first briefly introduce the background of NMT and pre-training methods, and point out the main challenges of applying pre-training to NMT. On that basis, we focus on analyzing the role of pre-training in improving NMT performance, how to design better pre-training models for specific NMT tasks, and how to better integrate pre-trained models into NMT systems. In each part, we provide examples, discuss training tricks, and analyze what is transferred when pre-training is applied.

The first topic is monolingual pre-training for NMT, one of the most thoroughly studied areas. Monolingual text representations such as ELMo, GPT, MASS, and BERT have proven advantageous, significantly improving performance on a variety of NLP tasks [25,8,28,30]. However, NMT has several distinctive characteristics, such as the availability of large training data (10 million examples or more) and the high capacity of baseline NMT models, which require careful design of pre-training. In this part, we introduce different pre-training methods and analyze best practices for applying them in different machine translation scenarios, such as unsupervised NMT, low-resource NMT, and resource-rich NMT [37,35]. We also cover techniques for fine-tuning pre-trained models with various strategies, such as knowledge distillation and adapters [4,16].
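
To make the adapter idea concrete, the module below is a sketch of a standard bottleneck adapter (my own illustration, not code from the cited work [4,16]): a small down-project/up-project network with a residual connection, inserted after a Transformer sub-layer so that only the adapter parameters are updated during fine-tuning while the pre-trained weights stay frozen. Dimensions are illustrative.

```python
# Bottleneck adapter sketch for parameter-efficient fine-tuning of a pre-trained NMT model.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model: int = 1024, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)   # project down to a small bottleneck
        self.up = nn.Linear(bottleneck, d_model)     # project back up to the model dimension
        self.act = nn.ReLU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # residual connection keeps the pre-trained representation intact at initialization
        return hidden + self.up(self.act(self.down(hidden)))

# Typical usage (sketch): freeze the pre-trained model and train only the adapters.
# for p in pretrained_model.parameters():
#     p.requires_grad = False
```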

The next topic is multilingual pre-training for NMT. In this setting, we aim to alleviate the English-centric bias and argue that universal representations across languages can be built to improve massively multilingual NMT. In this part, we discuss general cross-lingual representations and analyze how knowledge transfers across languages, which helps in designing better multilingual pre-training, especially for zero-shot transfer to non-English language pairs [15,27,7,26,13,17,19,23,18].

The last technical part of this tutorial is about pre-training for speech translation in NMT. In particular, we focus on leveraging weakly supervised or unsupervised training data to improve speech translation. In this part, we discuss the possibility of building a joint representation across speech and text, and show how text or audio pre-training can guide text generation in NMT [33,21,32,14,22,10,9,11,36].

At the end of the tutorial, we point out best practices for applying pre-training to NMT. These topics cover various pre-training methods for different NMT scenarios. Afterwards, the audience will understand why pre-training for NMT differs from other tasks and how to make full use of it. Importantly, we provide an in-depth analysis of how and why pre-training works in NMT, which should inspire the design of NMT-specific pre-training paradigms in the future.

https://sites.cs.ucsb.edu/~lilei/TALKS/2021-ACL/

Speakers:

Lei Li is an assistant professor at UC Santa Barbara and was formerly director of the ByteDance AI Lab. He received his bachelor's degree from Shanghai Jiao Tong University and his PhD from the Computer Science Department of Carnegie Mellon University. He was a postdoctoral researcher at UC Berkeley and a Young Scientist at Baidu's US Deep Learning Lab. His honors include runner-up for the 2012 ACM SIGKDD Best Dissertation Award, second prize of the 2017 Wu Wenjun AI Technology Invention Award, 2017 CCF Distinguished Speaker, and the 2019 CCF Qingzhu Award. He has published more than 100 papers at top international venues in machine learning, data mining, and natural language processing, and holds more than twenty invention patents. He serves on the CCF NLP technical committee and as an organizing committee member and area chair for conferences including EMNLP, NeurIPS, AAAI, IJCAI, and KDD.

Mingxuan Wang is a senior researcher at the ByteDance AI Lab. He received his PhD from the Institute of Computing Technology, Chinese Academy of Sciences, and his main research area is machine translation. He led the development of the VolcTrans translation system, which serves over 100 million users worldwide, and his team has won first place several times in the WMT machine translation evaluation. He has published more than 30 papers at ACL, EMNLP, NAACL, and related venues, and serves on the CCF NLP technical committee and the organizing committees of several conferences in China and abroad.


Latest Papers

Training data for machine translation (MT) is often sourced from a multitude of large corpora that are multi-faceted in nature, e.g. containing contents from multiple domains or different levels of quality or complexity. Naturally, these facets do not occur with equal frequency, nor are they equally important for the test scenario at hand. In this work, we propose to optimize this balance jointly with MT model parameters to relieve system developers from manual schedule design. A multi-armed bandit is trained to dynamically choose between facets in a way that is most beneficial for the MT system. We evaluate it on three different multi-facet applications: balancing translationese and natural training data, or data from multiple domains or multiple language pairs. We find that bandit learning leads to competitive MT systems across tasks, and our analysis provides insights into its learned strategies and the underlying data sets.
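
A minimal sketch of the bandit component described above (my own illustration under assumptions; the paper's exact algorithm and reward definition may differ): an EXP3-style bandit whose arms are the data facets, rewarded by how much training on a batch drawn from the chosen facet improves the MT system, e.g. a development-set loss reduction scaled into [0, 1].

```python
# EXP3-style multi-armed bandit for choosing which data facet to sample the next batch from.
import math
import random

class Exp3:
    def __init__(self, n_arms: int, gamma: float = 0.1):
        self.n = n_arms
        self.gamma = gamma
        self.weights = [1.0] * n_arms

    def _probs(self):
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n for w in self.weights]

    def choose(self) -> int:
        return random.choices(range(self.n), weights=self._probs())[0]

    def update(self, arm: int, reward: float):
        # importance-weighted exponential update; only the pulled arm's weight changes
        p = self._probs()[arm]
        self.weights[arm] *= math.exp(self.gamma * reward / (p * self.n))

# Usage sketch: facets = ["in-domain", "out-of-domain", "back-translated"]
# bandit = Exp3(len(facets))
# each step: f = bandit.choose(); train the MT model on a batch from facets[f];
# bandit.update(f, reward=scaled_dev_loss_improvement)
```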
