Approximate Dynamic Programming: Solving The Curses Of Dimensionality, Second Edition

Product Information

List price: NT$5,926
Sale price: NT$5,333 (90% of list)
To order this book, please call customer service at 02-25006600 (ext. 130, 131).
Product Description

Understanding approximate dynamic programming (ADP) in large industrial settings helps develop practical and high-quality solutions to problems that involve making decisions in the presence of uncertainty. With a focus on modeling and algorithms in conjunction with the language of mainstream operations research, artificial intelligence, and control theory, this second edition of Approximate Dynamic Programming: Solving the Curses of Dimensionality uniquely integrates four distinct disciplines (Markov decision processes, mathematical programming, simulation, and statistics) to show students, practitioners, and researchers how to successfully model and solve a wide range of real-life problems using ADP.
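
As a rough illustration of the forward-simulation flavor of ADP described above (sampled one-step lookaheads combined with a smoothed update of a value function approximation), the minimal Python sketch below works through a toy inventory problem. It is not taken from the book; every name and parameter (MAX_INV, ORDER_COST, the stepsize ALPHA, and so on) is a hypothetical choice made only for this example.

# Illustrative sketch only (not from the book): forward approximate value
# iteration with a lookup-table value function on a toy inventory problem.
# All problem data below are hypothetical choices for this example.
import random

random.seed(0)

MAX_INV = 10            # inventory levels 0..MAX_INV (the state)
ACTIONS = range(0, 6)   # order quantities (the decision)
PRICE, ORDER_COST, HOLD_COST = 4.0, 2.0, 0.5
GAMMA, ALPHA, N_ITERS = 0.9, 0.1, 20000

V = [0.0] * (MAX_INV + 1)   # value function approximation (lookup table)

def step(inv, order):
    """Simulate one period: random demand is the exogenous information."""
    demand = random.randint(0, 5)
    stocked = min(inv + order, MAX_INV)
    sales = min(stocked, demand)
    next_inv = stocked - sales
    reward = PRICE * sales - ORDER_COST * order - HOLD_COST * next_inv
    return next_inv, reward

state = 0
for _ in range(N_ITERS):
    # Pick the order that maximizes a sampled one-step estimate of value.
    best_a, best_v = 0, float("-inf")
    for a in ACTIONS:
        nxt, r = step(state, a)
        v_hat = r + GAMMA * V[nxt]
        if v_hat > best_v:
            best_a, best_v = a, v_hat
    # Smoothed update (stepsize ALPHA) of the current state's value,
    # then step forward by simulating the chosen decision.
    V[state] = (1 - ALPHA) * V[state] + ALPHA * best_v
    state, _ = step(state, best_a)

print([round(v, 1) for v in V])   # learned values for inventory levels 0..10

The point of the sketch is the algorithmic loop, not the numbers: the value function is estimated from simulated trajectories rather than by enumerating all states, which is the basic device the book uses to sidestep the curses of dimensionality.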

Table of Contents

Preface.

Acknowledgments.

1. The challenges of dynamic programming.

1.1 A dynamic programming example: a shortest path problem.

1.2 The three curses of dimensionality.

1.3 Some real applications.

1.4 Problem classes.

1.5 The many dialects of dynamic programming.

1.6 What is new in this book?

1.7 Pedagogy.

1.8 Bibliographic notes.

2. Some illustrative models.

2.1 Deterministic problems.

2.2 Stochastic problems.

2.3 Information acquisition problems.

2.4 A simple modeling framework for dynamic programs.

2.5 Bibliographic notes.

Problems.

3. Introduction to Markov decision processes.

3.1 The optimality equations.

3.2 Finite horizon problems.

3.3 Infinite horizon problems.

3.4 Value iteration.

3.5 Policy iteration.

3.6 Hybrid value-policy iteration.

3.7 Average reward dynamic programming.

3.8 The linear programming method for dynamic programs.

3.9 Monotone policies.

3.10 Why does it work?

3.11 Bibliographic notes.

Problems.

4. Introduction to approximate dynamic programming.

4.1 The three curses of dimensionality (revisited).

4.2 The basic idea.

4.3 Q-learning and SARSA.

4.4 Real-time dynamic programming.

4.5 Approximate value iteration.

4.6 The post-decision state variable.

4.7 Low-dimensional representations of value functions.

4.8 So just what is approximate dynamic programming?

4.9 Experimental issues.

4.10 But does it work?

4.11 Bibliographic notes.

Problems.

5. Modeling dynamic programs.

5.1 Notational style.

5.2 Modeling time.

5.3 Modeling resources.

5.4 The states of our system.

5.5 Modeling decisions.

5.6 The exogenous information process.

5.7 The transition function.

5.8 The objective function.

5.9 A measure-theoretic view of information.

5.10 Bibliographic notes.

Problems.

6. Policies.

6.1 Myopic policies.

6.2 Lookahead policies.

6.3 Policy function approximations.

6.4 Value function approximations.

6.5 Hybrid strategies.

6.6 Randomized policies.

6.7 How to choose a policy?

6.8 Bibliographic notes.

Problems.

7. Policy search.

7.1 Background.

7.2 Gradient search.

7.3 Direct policy search for finite alternatives.

7.4 The knowledge gradient algorithm for discrete alternatives.

7.5 Simulation optimization.

7.6 Why does it work?

7.7 Bibliographic notes.

Problems.

8. Approximating value functions.

8.1 Lookup tables and aggregation.

8.2 Parametric models.

8.3 Regression variations.

8.4 Nonparametric models.

8.5 Approximations and the curse of dimensionality.

8.6 Why does it work?

8.7 Bibliographic notes.

Problems.

9. Learning value function approximations.

9.1 Sampling the value of a policy.

9.2 Stochastic approximation methods.

9.3 Recursive least squares for linear models.

9.4 Temporal difference learning with a linear model.

9.5 Bellman’s equation using a linear model.

9.6 Analysis of TD(0), LSTD and LSPE using a single state.

9.7 Gradient-based methods.

9.8 Least squares temporal differencing with kernel regression.

9.9 Value function approximations based on Bayesian learning.

9.10 Why does it work?

9.11 Bibliographic notes.

Problems.

10. Optimizing while learning.

10.1 Overview of algorithmic strategies.

10.2 Approximate value iteration and Q-learning using lookup tables.

10.3 Statistical bias in the max operator.

10.4 Approximate value iteration and Q-learning using linear models.

10.5 Approximate policy iteration.

10.6 The actor-critic paradigm.

10.7 Policy gradient methods.

10.8 The linear programming method using basis functions.

10.9 Approximate policy iteration using kernel regression.

10.10 Finite horizon approximations for steady-state applications.

10.11 Bibliographic notes.

Problems.

11. Adaptive estimation and stepsizes.

11.1 Learning algorithms and stepsizes.

11.2 Deterministic stepsize recipes.

11.3 Stochastic stepsizes.

11.4 Optimal stepsizes for nonstationary time series.

11.5 Optimal stepsizes for approximate value iteration.

11.6 Convergence.

11.7 Guidelines for choosing stepsize formulas.

11.8 Bibliographic notes.

Problems.

12. Exploration vs. exploitation.

12.1 A learning exercise: the nomadic trucker.

12.2 An introduction to learning.

12.3 Heuristic learning policies.

12.4 Gittins indices for online learning.

12.5 The knowledge gradient policy.

12.6 Learning with a physical state.

12.7 Bibliographic notes.

Problems.

13. Value function approximations for resource allocation problems.

13.1 Value functions versus gradients.

13.2 Linear approximations.

13.3 Piecewise linear approximations.

13.4 Solving a resource allocation problem using piecewise linear functions.

13.5 The SHAPE algorithm.

13.6 Regression methods.

13.7 Cutting planes.

13.8 Why does it work?

13.9 Bibliographic notes.

Problems.

14. Dynamic resource allocation problems.

14.1 An asset acquisition problem.

14.2 The blood management problem.

14.3 A portfolio optimization problem.

14.4 A general resource allocation problem.

14.5 A fleet management problem.

14.6 A driver management problem.

14.7 Bibliographic notes.

Problems.

15. Implementation challenges.

15.1 Will ADP work for your problem?

15.2 Designing an ADP algorithm for complex problems.

15.3 Debugging an ADP algorithm.

15.4 Practical issues.

15.5 Modeling your problem.

15.6 Online vs. offline models.

15.7 If it works, patent it!

Index.

Purchasing Notes

Cover images shown for foreign-language books are samples provided by the publisher; the item actually shipped will be the edition currently supplied by the publisher. For some titles, prices may be adjusted in line with the publisher's supply situation and prevailing exchange rates.

Items not in stock will be ordered and air-freighted on your behalf after you complete the order process. To shorten the waiting time, we recommend placing foreign-language books in a separate order from other items so you receive them as quickly as possible; average procurement time is 1 to 2 months.

To protect your rights, 三民網路書店 (San Min Online Bookstore) offers members a seven-day product inspection period, beginning on the day the item is received.

To return an item, please send it back within the inspection period. The item must be in brand-new condition and in its complete original packaging (product, accessories, invoice, free gifts, etc.); otherwise the return cannot be accepted.
