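Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize cumulative reward. Alongside supervised learning and unsupervised learning, it is one of the three basic machine learning paradigms. Reinforcement learning differs from supervised learning in that labelled input/output pairs need not be presented and sub-optimal actions need not be explicitly corrected. Instead, the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated in the form of a Markov decision process (MDP), because many reinforcement learning algorithms for this setting use dynamic programming techniques. The main difference between classical dynamic programming methods and reinforcement learning algorithms is that the latter do not assume an exact mathematical model of the MDP, and they target large MDPs for which exact methods are infeasible.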
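As a rough illustration of the ideas above, the following is a minimal sketch of tabular Q-learning with epsilon-greedy action selection. The five-state chain environment, the `step` function, and all hyperparameter values are assumptions made up for this example; they do not come from the page. The learner never reads the transition rules directly and only samples them through `step`, which is the model-free setting described above, while `epsilon` controls how often it explores instead of exploiting its current estimates.

```python
import random

# Hypothetical 5-state chain: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 yields reward 1 and ends the episode.
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4


def step(state, action):
    """Environment dynamics; the learner only samples from this function."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-known action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target; no model of the MDP is used.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(greedy_policy)  # prints [1, 1, 1, 1]: always move right toward the goal
```

Setting `epsilon` to 0 would make the agent purely exploitative, and setting it to 1 would make it a random walker; values in between trade the two off.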

Knowledge Collection

Reinforcement Learning: Zhuanzhi Collection

Getting Started

Surveys

Advanced Papers

  1. Rasim M Alguliev, Ramiz M Aliguliyev, Makrufa S Hajirahimova, and Chingiz A Mehdiyev. 2011. MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications 38, 12 (2011), 14514–14522. [http://www.sciencedirect.com/science/article/pii/S0957417411008177]
  2. Rasim M Alguliev, Ramiz M Aliguliyev, and Nijat R Isazade. 2013. Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications 40, 5 (2013), 1675–1689. [http://www.sciencedirect.com/science/article/pii/S0957417412010688]
  3. M. Allahyari, S. Pouriyeh, M. Assefi, S. Safaei, E. D. Trippe, J. B. Gutierrez, and K. Kochut. 2017. A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques. ArXiv e-prints (2017). arXiv:1707.02919 [https://arxiv.org/abs/1707.02919]
  4. Einat Amitay and Cécile Paris. 2000. Automatically summarising web sites: is there a way around it? In Proceedings of the ninth international conference on Information and knowledge management. ACM, 173–179. [https://dl.acm.org/citation.cfm?id=354756.354816]
  5. Elena Baralis, Luca Cagliero, Saima Jabeen, Alessandro Fiori, and Sajid Shah. 2013. Multi-document summarization based on the Yago ontology. Expert Systems with Applications 40, 17 (2013), 6976–6984. [http://www.sciencedirect.com/science/article/pii/S0957417413004429]
  6. Taylor Berg-Kirkpatrick, Dan Gillick, and Dan Klein. 2011. Jointly learning to extract and compress. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 481–490. [https://dl.acm.org/citation.cfm?id=2002534&%3bpreflayout=flat]
  7. Asli Celikyilmaz and Dilek Hakkani-Tur. 2010. A hybrid hierarchical model for multi-document summarization. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 815–824. [https://dl.acm.org/citation.cfm?id=1858765]
  8. Ping Chen and Rakesh Verma. 2006. A query-based medical information summarization system using ontology knowledge. In Computer-Based Medical Systems, 2006. CBMS 2006. 19th IEEE International Symposium on. IEEE, 37–42. [https://dl.acm.org/citation.cfm?id=1153019]
  9. Freddy Chong Tat Chua and Sitaram Asur. 2013. Automatic Summarization of Events from Social Media. In ICWSM. [https://www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/view/6057/0]
  10. John M Conroy and Dianne P O’Leary. 2001. Text summarization via hidden markov models. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 406–407. [http://pdfs.semanticscholar.org/1213/3cfc6688cc2cdea57595b045a28b94d98f1d.pdf]
  11. Hal Daumé III and Daniel Marcu. 2006. Bayesian query-focused summarization. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 305–312. [https://dl.acm.org/citation.cfm?id=1220214]
  12. J-Y Delort, Bernadette Bouchon-Meunier, and Maria Rifqi. 2003. Enhanced web document summarization using hyperlinks. In Proceedings of the fourteenth ACM conference on Hypertext and hypermedia. ACM, 208–215. [http://dl.acm.org/citation.cfm?id=900097]
  13. Günes Erkan and Dragomir R Radev. 2004. LexRank: Graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res.(JAIR) 22, 1 (2004), 457–479. [https://arxiv.org/abs/1109.2128]
  14. Yihong Gong and Xin Liu. 2001. Generic text summarization using relevance measure and latent semantic analysis. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 19–25. [https://dl.acm.org/citation.cfm?doid=383952.383955]
  15. Vishal Gupta and Gurpreet Singh Lehal. 2010. A survey of text summarization extractive techniques. Journal of Emerging Technologies in Web Intelligence 2, 3 (2010), 258–268. [http://www.learnpunjabi.org/pdf/survey-paper.pdf]
  16. Ben Hachey, Gabriel Murray, and David Reitter. 2006. Dimensionality reduction aids term co-occurrence based multi-document summarization. In Proceedings of the workshop on task-focused summarization and question answering. Association for Computational Linguistics, 1–7. [http://www.ltg.ed.ac.uk/np/publications/ltg/papers/Hachey2006Dimensionality.pdf]
  17. John Hannon, Kevin McCarthy, James Lynch, and Barry Smyth. 2011. Personalized and automatic social summarization of events in video. In Proceedings of the 16th international conference on Intelligent user interfaces. ACM, 335–338. [https://dl.acm.org/citation.cfm?id=1943459]
  18. Sanda Harabagiu and Finley Lacatusu. 2005. Topic themes for multi-document summarization. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 202–209. [https://dl.acm.org/citation.cfm?id=1076071]
  19. Leonhard Hennig, Winfried Umbrath, and Robert Wetzker. 2008. An ontology-based approach to text summarization. In Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT’08. IEEE/WIC/ACM International Conference on, Vol. 3. IEEE, 291–294. [http://dl.acm.org/citation.cfm?id=1487345]
  20. Meishan Hu, Aixin Sun, and Ee-Peng Lim. 2007. Comments-oriented blog summarization by sentence extraction. In Proceedings of the sixteenth ACM conference on Conference on information and knowledge management. ACM, 901–904. [https://dl.acm.org/citation.cfm?id=1321571&CFID=824361189&CFTOKEN=11022411]
  21. Meishan Hu, Aixin Sun, and Ee-Peng Lim. 2008. Comments-oriented document summarization: understanding documents with readers’ feedback. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 291–298. [https://dl.acm.org/citation.cfm?id=1390385&CFID=824361189&CFTOKEN=11022411]
  22. Elena Lloret and Manuel Palomar. 2012. Text summarisation in progress: a literature review. Artificial Intelligence Review 37, 1 (2012), 1–41. [https://link.springer.com/article/10.1007%2Fs10462-011-9216-z]
  23. Hans Peter Luhn. 1958. The automatic creation of literature abstracts. IBM Journal of research and development 2, 2 (1958), 159–165. [http://www.di.ubi.pt/~jpaulo/competence/general/(1958)Luhn.pdf]
      Inderjeet Mani and Eric Bloedorn. 1999. Summarizing similarities and differences among related documents. Information Retrieval 1, 1-2 (1999), 35–67.
  24. Inderjeet Mani, Gary Klein, David House, Lynette Hirschman, Therese Firmin, and Beth Sundheim. 2002. SUMMAC: a text summarization evaluation. Natural Language Engineering 8, 01 (2002), 43–68. [https://www.researchgate.net/publication/231901086_SUMMAC_a_text_summarization_evaluation]
  25. Qiaozhu Mei and ChengXiang Zhai. 2008. Generating Impact-Based Summaries for Scientific Literature. In ACL, Vol. 8. Citeseer, 816–824.
  26. Rada Mihalcea and Paul Tarau. 2004. TextRank: Bringing order into texts. Association for Computational Linguistics. [https://digital.library.unt.edu/ark:/67531/metadc30962/]
  27. Rada Mihalcea and Paul Tarau. 2005. A language independent algorithm for single and multiple document summarization. (2005). [https://www.researchgate.net/publication/228340005_A_language_independent_algorithm_for_single_and_multiple_document_summarization]
  28. Liu Na, Li Ming-xia, Lu Ying, Tang Xiao-jun, Wang Hai-wen, and Xiao Peng. 2014. Mixture of topic model for multi-document summarization. In Control and Decision Conference (2014 CCDC), The 26th Chinese. IEEE, 5168–5172. [http://ieeexplore.ieee.org/document/6853102/metrics]
  29. Ani Nenkova and Amit Bagga. 2004. Facilitating email thread access by extractive summary generation. Recent advances in natural language processing III: selected papers from RANLP 2003 (2004), 287. [https://www.researchgate.net/publication/221303547_Facilitating_email_thread_access_by_extractive_summary_generation]
  30. Ani Nenkova and Kathleen McKeown. 2012. A survey of text summarization techniques. In Mining Text Data. Springer, 43–76 [https://www.mendeley.com/research-papers/survey-text-summarization-techniques/]
  31. Paula S Newman and John C Blitzer. 2003. Summarizing archived discussions: a beginning. In Proceedings of the 8th international conference on Intelligent user interfaces. ACM, 273–276. [https://dl.acm.org/citation.cfm?id=604097]
  32. You Ouyang, Wenjie Li, Sujian Li, and Qin Lu. 2011. Applying regression models to query-focused multi-document summarization. Information Processing & Management 47, 2 (2011), 227–237. [http://www.sciencedirect.com/science/article/pii/S0306457310000257]
  33. Makbule Gulcin Ozsoy, Ilyas Cicekli, and Ferda Nur Alpaslan. 2010. Text summarization of turkish texts using latent semantic analysis. In Proceedings of the 23rd international conference on computational linguistics. Association for Computational Linguistics, 869–876. [https://dl.acm.org/citation.cfm?id=1873879]
  34. Vahed Qazvinian and Dragomir R Radev. 2008. Scientific paper summarization using citation summary networks. In Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 689–696. [https://dl.acm.org/citation.cfm?id=1599081.1599168]
  35. Vahed Qazvinian, Dragomir R Radev, Saif M Mohammad, Bonnie Dorr, David Zajic, Michael Whidby, and Taesun Moon. 2014. Generating extractive summaries of scientific paradigms. arXiv preprint arXiv:1402.0556 (2014). [https://www.researchgate.net/publication/229534087_Generating_surveys_of_scientific_paradigms]
  36. Dragomir R Radev, Eduard Hovy, and Kathleen McKeown. 2002. Introduction to the special issue on summarization. Computational linguistics 28, 4 (2002), 399–408. [https://dl.acm.org/citation.cfm?id=638178.638179]
  37. Dragomir R Radev, Hongyan Jing, and Malgorzata Budzikowska. 2000. Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. In Proceedings of the 2000 NAACL-ANLP Workshop on Automatic Summarization. Association for Computational Linguistics, 21–30. [http://www.docin.com/p-853652484.html]
  38. Dragomir R Radev, Hongyan Jing, Małgorzata Styś, and Daniel Tam. 2004. Centroid-based summarization of multiple documents. Information Processing & Management 40, 6 (2004), 919–938. [http://www.sciencedirect.com/science/article/pii/S0306457303000955]
  39. Owen Rambow, Lokesh Shrestha, John Chen, and Chirsty Lauridsen. 2004. Summarizing email threads. In Proceedings of HLT-NAACL 2004: Short Papers. Association for Computational Linguistics, 105–108. [https://dl.acm.org/citation.cfm?id=1614011]
  40. Zhaochun Ren, Shangsong Liang, Edgar Meij, and Maarten de Rijke. 2013. Personalized time-aware tweets summarization. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM, 513–522. [https://staff.fnwi.uva.nl/m.derijke/wp-content/papercite-data/pdf/ren-personalized-2013.pdf]
  41. Horacio Saggion and Thierry Poibeau. 2013. Automatic text summarization: Past, present and future. In Multi-source, Multilingual Information Extraction and Summarization. Springer, 3–21. [https://hal.archives-ouvertes.fr/hal-00782442/document]
  42. Gerard Salton and Christopher Buckley. 1988. Term-weighting approaches in automatic text retrieval. Information processing & management 24, 5 (1988), 513–523. [http://www.sciencedirect.com/science/article/pii/0306457388900210]
  43. Yogesh Sankarasubramaniam, Krishnan Ramanathan, and Subhankar Ghosh. 2014. Text summarization using Wikipedia. Information Processing & Management 50, 3 (2014), 443–461. [http://www.sciencedirect.com/science/article/pii/S0306457314000119]
  44. Beaux P Sharifi, David I Inouye, and Jugal K Kalita. 2013. Summarization of Twitter Microblogs. Comput. J. (2013), bxt109. [http://cs.uccs.edu/~jkalita/papers/2013/SharifiBeauxComputerJournal2013.pdf]
  45. E. D. Trippe, J. B. Aguilar, Y. H. Yan, M. V. Nural, J. A. Brady, M. Assefi, S. Safaei, M. Allahyari, S. Pouriyeh, M. R. Galinski, J. C. Kissinger, and J. B. Gutierrez. 2017. A Vision for Health Informatics: Introducing the SKED Framework.An Extensible Architecture for Scientific Knowledge Extraction from Data. ArXiv e-prints (2017). arXiv:1706.07992 [https://arxiv.org/abs/1706.07992]
  46. Neural Summarization by Extracting Sentences and Words [https://arxiv.org/pdf/1603.07252.pdf]
  47. Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond [https://arxiv.org/pdf/1602.06023.pdf]
  48. A Neural Attention Model for Abstractive Sentence Summarization [https://arxiv.org/pdf/1509.00685.pdf]
  49. A Deep Reinforced Model for Abstractive Summarization [https://arxiv.org/pdf/1705.04304.pdf]
  50. Text summarization using Latent Semantic Analysis [https://www.researchgate.net/publication/220195824_Text_summarization_using_Latent_Semantic_Analysis]
  51. TextRank: Bringing Order into Texts [https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf]
  52. Sentence Extraction Based Single Document Summarization [http://oldwww.iiit.ac.in/cgi-bin/techreports/display_detail.cgi?id=IIIT/TR/2008/97]

Code

  1. Sequence-to-Sequence with Attention Model for Text Summarization.
    [https://github.com/tensorflow/models/tree/master/research/textsum]
  2. gensim.summarization offers TextRank summarization
    [https://radimrehurek.com/gensim/summarization/summariser.html]

Tutorial

  1. Automatic Text Summarization: Current State and Future. Xiaojun Wan (萬小軍), Peking University, October 16, 2016 [https://pan.baidu.com/s/1nuTUrSP]
  2. Tutorial on automatic summarization [https://www.slideshare.net/dinel/orasan-ranlp2009] [https://pan.baidu.com/s/1o8bZJJk]
  3. How to Run Text Summarization with TensorFlow [https://hackernoon.com/how-to-run-text-summarization-with-tensorflow-d4472587602d]
  4. Text Summarization with Gensim [https://rare-technologies.com/text-summarization-with-gensim/]

Datasets

  1. DUC 2004 [http://www.cis.upenn.edu/~nlp/corpora/sumrepo.html]
  2. Opinosis Dataset - Topic related review sentences [http://kavita-ganesan.com/opinosis-opinion-dataset]
  3. 17 Timelines [http://kavita-ganesan.com/opinosis-opinion-dataset]
  4. Legal Case Reports Data Set [http://archive.ics.uci.edu/ml/datasets/Legal+Case+Reports]

Domain Experts

  1. Xiaojun Wan (萬小軍), Peking University [https://sites.google.com/site/wanxiaojun1979/]
  2. Bing Qin (秦兵), Harbin Institute of Technology [https://m.weibo.cn/u/1880324342?sudaref=login.sina.com.cn&retcode=6102]
  3. Ting Liu (劉挺) [http://homepage.hit.edu.cn/pages/liuting]

VIP Content

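Abstract: Recommender systems aim to find and automatically recommend valuable information and services to users from massive amounts of data. They effectively address information overload and have become an important information technology in the big data era. However, data sparsity, cold start, and interpretability remain key technical difficulties that limit their wide application. Reinforcement learning is an interactive learning technique: by interacting with users and obtaining feedback, it captures their interest drift in real time and models user preferences dynamically, which can better address these classic problems of traditional recommender systems. Reinforcement learning has become a research hotspot in the recommender systems field in recent years. From a survey perspective, this paper first briefly reviews recommender systems and reinforcement learning, then analyzes how reinforcement learning can improve recommender systems, and reviews recent reinforcement-learning-based recommendation research, summarizing traditional reinforcement learning recommendation and deep reinforcement learning recommendation separately. On this basis, it highlights several recent research frontiers in reinforcement learning recommendation and their applications. Finally, it analyzes and discusses future development trends for reinforcement learning in recommender systems.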

http://www.jsjkx.com/CN/10.11896/jsjkx.210200085
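
The idea in the abstract of learning from user feedback while preferences drift can be sketched with a very small example. The code below is not taken from the surveyed paper; it is an illustrative epsilon-greedy bandit in which the two items, the click probabilities, the drift point at step 2000, and the names `recommend_with_feedback` and `drifting_user` are all hypothetical. A constant step size is used so that recent feedback outweighs old feedback, which lets the estimates follow the drift.

```python
import random


def recommend_with_feedback(click_probs_at, steps, epsilon=0.1,
                            step_size=0.1, seed=0):
    """Epsilon-greedy recommender that tracks drifting click rates.

    click_probs_at(t) returns the simulated click probability of each item
    at step t (a made-up stand-in for a real user).
    """
    rng = random.Random(seed)
    n_items = len(click_probs_at(0))
    estimates = [0.0] * n_items
    history = []
    for t in range(steps):
        if rng.random() < epsilon:
            item = rng.randrange(n_items)  # explore a random item
        else:
            item = max(range(n_items), key=lambda i: estimates[i])  # exploit
        clicked = 1.0 if rng.random() < click_probs_at(t)[item] else 0.0
        # Constant step size: an exponential moving average of feedback.
        estimates[item] += step_size * (clicked - estimates[item])
        history.append(item)
    return estimates, history


def drifting_user(t):
    # Hypothetical user: prefers item 0 at first, then switches to item 1.
    return [0.9, 0.1] if t < 2000 else [0.1, 0.9]


estimates, history = recommend_with_feedback(drifting_user, steps=4000)
share_item0_before = history[1500:2000].count(0) / 500
share_item1_after = history[3500:4000].count(1) / 500
print(share_item0_before, share_item1_after)
```

In this toy setting the recommender mostly shows item 0 before the drift and mostly item 1 well after it. A full reinforcement learning recommender would additionally model state and long-term reward, which this bandit sketch leaves out.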

Latest Content

Reconfigurable intelligent surfaces (RISs) can assist the wireless systems in providing reliable and low-latency links to realize the requirements in Industry 4.0. In this paper, the practical phase shift optimization in a RIS-aided ultra-reliable and low-latency communication (URLLC) system at a factory setting is performed by applying a novel deep reinforcement learning (DRL) algorithm named as twin-delayed deep deterministic policy gradient (TD3). First, the system achievable rate in finite blocklength (FBL) regime is identified for each actuator then, the problem is formulated where the objective is to maximize the total achievable FBL rate, subject to non-linear amplitude response and the phase shift values constraint. Since the amplitude response equality constraint is highly non-convex and non-linear, we employ the TD3 to tackle the problem. The considered method relies on interacting RIS with industrial scenario by taking actions which are the phase shifts at the RIS elements, to maximize the total FBL rate. We assess the performance loss of the system when the RIS is non-ideal, i.e., non-linear amplitude response with/without phase quantization and compare it with ideal RIS. The numerical results show that optimizing phase shifts in non-ideal RIS via the considered TD3 method is highly beneficial to improve the performance.
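
The abstract names TD3 but gives no implementation details, so the following is only a generic sketch of the critic target used by the standard TD3 algorithm (clipped double-Q learning with target policy smoothing), written for a scalar action. The toy actor and critics, the action bounds of -1 and 1, and all parameter defaults are assumptions for illustration; they are not the paper's RIS phase-shift model or its settings.

```python
import random


def clip(x, low, high):
    return max(low, min(high, x))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Clipped double-Q target with target policy smoothing (scalar action)."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise,
                       action_low, action_high)
    # Clipped double-Q: use the smaller of the two target critics' values.
    min_q = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (0.0 if done else 1.0) * min_q


# Toy stand-ins for the target networks (hypothetical, not learned models).
def toy_actor(state):
    return 0.5 * state


def toy_q1(state, action):
    return state + action


def toy_q2(state, action):
    return state + action + 1.0


example_target = td3_target(1.0, 0.4, False, toy_actor, toy_q1, toy_q2)
print(example_target)
```

The remaining ingredient of TD3, updating the actor and the target networks less often than the critics, concerns the training loop and is not shown here.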

