Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning with deep learning techniques. The central task in traditional reinforcement learning is for an agent to learn, from the rewards it receives from its environment, behavior that maximizes those rewards. However, traditional model-free reinforcement learning methods require function-approximation techniques so that the agent can learn a value function or a policy. In this setting, the powerful function-approximation capability of deep learning is a natural replacement for hand-crafted features and makes higher-performing end-to-end learning possible.
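As a minimal sketch of the idea above (illustrative only, not any specific paper's method): a parametric approximator trained with one-step temporal-difference updates can stand in for a learned value function. Here a single linear weight plays the role that a deep network plays in DRL.

```python
# Minimal sketch: a linear Q-value approximator with one-step TD updates.
# In DRL, the linear model below would be replaced by a deep neural network.

def q_value(weights, features):
    """Approximate Q(s, a) as a dot product of weights and state-action features."""
    return sum(w * f for w, f in zip(weights, features))

def td_update(weights, features, reward, next_q, gamma=0.9, lr=0.1):
    """Move the estimate toward the bootstrapped target r + gamma * next_q."""
    target = reward + gamma * next_q
    error = target - q_value(weights, features)
    return [w + lr * error * f for w, f in zip(weights, features)]

# Toy usage: a state with feature [1.0] yields reward 1.0 and a terminal
# next state, so the learned value should approach 1.0.
weights = [0.0]
for _ in range(100):
    weights = td_update(weights, [1.0], reward=1.0, next_q=0.0)
print(round(weights[0], 2))
```

The same update rule applies unchanged when `q_value` is a deep network; only the gradient computation becomes more involved.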


Abstract: Deep reinforcement learning is one of the emerging technologies in artificial intelligence. It combines the powerful feature-extraction capability of deep learning with the decision-making capability of reinforcement learning, realizing an end-to-end framework from perceptual input to decision output, with strong learning ability and broad applications. However, existing research shows that deep reinforcement learning has security vulnerabilities and is susceptible to adversarial-example attacks. To improve the robustness of deep reinforcement learning and enable the safe deployment of such systems, this paper comprehensively surveys existing work on deep reinforcement learning methods, adversarial attacks, defense methods, and security analysis, and summarizes the open problems and future trends in deep reinforcement learning security, aiming to provide a foundation for related security research and engineering applications.
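To make the adversarial-example threat mentioned in the abstract concrete, here is a hedged sketch of an FGSM-style perturbation, a common attack family in this literature. The linear "policy" and the sign-gradient step are simplifying assumptions for illustration, not the survey's own method.

```python
# FGSM-style sketch: perturb each input feature by epsilon against the
# gradient of the policy's score, flipping the agent's decision.

def sign(x):
    return (x > 0) - (x < 0)

def policy_score(weights, state):
    """Toy linear policy: score = w . s; the agent acts when score > 0."""
    return sum(w * s for w, s in zip(weights, state))

def fgsm_perturb(weights, state, epsilon=0.1):
    """Shift each feature by epsilon opposite the score gradient.
    For a linear score, d(score)/d(state_i) = weights[i]."""
    return [s - epsilon * sign(w) for s, w in zip(state, weights)]

state = [0.3, 0.2]          # clean observation: score 0.5, agent acts
weights = [1.0, 1.0]
adv = fgsm_perturb(weights, state, epsilon=0.3)
print(policy_score(weights, state), policy_score(weights, adv))
```

A small, bounded perturbation of the observation is enough to change the sign of the score and hence the decision, which is exactly the fragility the surveyed defenses aim to remove.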

http://www.aas.net.cn/cn/article/doi/10.16383/j.aas.c200166


Latest Papers

Deep Reinforcement Learning (DRL) solutions are becoming pervasive at the edge of the network as they enable autonomous decision-making in a dynamic environment. However, to be able to adapt to the ever-changing environment, a DRL solution implemented on an embedded device has to continue to occasionally take exploratory actions even after initial convergence. In other words, the device has to occasionally take random actions and update the value function, i.e., re-train the Artificial Neural Network (ANN), to ensure its performance remains optimal. Unfortunately, embedded devices often lack the processing power and energy required to train the ANN. The energy aspect is particularly challenging when the edge device is powered only by means of Energy Harvesting (EH). To overcome this problem, we propose a two-part algorithm in which the DRL process is trained at the sink. The weights of the fully trained underlying ANN are then periodically transferred to the EH-powered embedded device taking actions. Using an EH-powered sensor, a real-world measurement dataset, and optimizing for the Age of Information (AoI) metric, we demonstrate that such a DRL solution can operate without any degradation in performance, with only a few ANN updates per day.
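The two-part scheme described in the abstract can be sketched as follows. The class names and the toy "training" step are assumptions for illustration: the sink keeps retraining the model, while the energy-harvesting device only runs cheap inference on a periodically synced copy of the weights.

```python
# Sketch of sink-side training with periodic weight transfer to an
# energy-harvesting (EH) device that performs inference only.

class Sink:
    def __init__(self):
        self.weights = [0.0]

    def train_step(self):
        # Stand-in for retraining the ANN with fresh exploratory experience.
        self.weights = [w + 0.1 for w in self.weights]

class EHDevice:
    def __init__(self):
        self.weights = [0.0]  # acts on a possibly stale copy

    def sync(self, sink_weights):
        # Periodic transfer of the fully trained weights from the sink.
        self.weights = list(sink_weights)

    def act(self, feature):
        # Cheap inference only; no on-device training or exploration.
        return self.weights[0] * feature

sink, device = Sink(), EHDevice()
for step in range(1, 11):
    sink.train_step()
    if step % 5 == 0:          # infrequent sync, e.g. a few updates per day
        device.sync(sink.weights)
print(device.act(2.0))
```

The design point is that between syncs the device spends energy only on forward passes; all gradient computation stays at the mains-powered sink.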
