Differential equation based framework for deep reinforcement learning
Contributor: | Gottschalk, Simon (author) |
---|---|
Format: | Thesis/Dissertation, Book |
Language: | English |
Published: | Stuttgart: Fraunhofer Verlag, [2021] |
Subjects: | |
Links: | http://deposit.dnb.de/cgi-bin/dokserv?id=ba6f57f463454af88bbd5791c954a581&prov=M&dok_var=1&dok_ext=htm https://d-nb.info/1227947135/04 http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032737289&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
Extent: | 130 pages, illustrations, diagrams, 21 cm |
ISBN: | 9783839616826 3839616824 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV047334726 | ||
003 | DE-604 | ||
005 | 20210730 | ||
007 | t| | ||
008 | 210617s2021 gw a||| m||| 00||| eng d | ||
015 | |a 21,N09 |2 dnb | ||
015 | |a 21,A19 |2 dnb | ||
015 | |a 21,H06 |2 dnb | ||
016 | 7 | |a 1227947135 |2 DE-101 | |
020 | |a 9783839616826 |c Broschur : EUR 65.00 (DE), EUR 66.90 (AT), CHF 100.20 (freier Preis) |9 978-3-8396-1682-6 | ||
020 | |a 3839616824 |9 3-8396-1682-4 | ||
024 | 3 | |a 9783839616826 | |
028 | 5 | 2 | |a Bestellnummer: fhg-twm_47 |
035 | |a (OCoLC)1240359531 | ||
035 | |a (DE-599)DNB1227947135 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
044 | |a gw |c XA-DE-BW | ||
049 | |a DE-29T |a DE-634 | ||
084 | |a 510 |2 23sdnb | ||
084 | |a 004 |2 23sdnb | ||
100 | 1 | |a Gottschalk, Simon |d 1990- |e Verfasser |0 (DE-588)1231111712 |4 aut | |
245 | 1 | 0 | |a Differential equation based framework for deep reinforcement learning |c Simon Gottschalk |
264 | 1 | |a Stuttgart |b Fraunhofer Verlag |c [2021] | |
300 | |a 130 Seiten |b Illustrationen, Diagramme |c 21 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
502 | |b Dissertation |c Technische Universität Kaiserslautern |d 2020 | ||
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Differentialgleichung |0 (DE-588)4012249-9 |2 gnd |9 rswk-swf |
653 | |a Deep Reinforcement Learning | ||
653 | |a Optimal control | ||
653 | |a Necessary optimality conditions | ||
653 | |a Machine Learning | ||
653 | |a Applied Mathematics | ||
653 | |a Optimization | ||
653 | |a Mathematiker | ||
653 | |a Informatiker | ||
653 | |a Data Scientists | ||
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | 1 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 2 | |a Differentialgleichung |0 (DE-588)4012249-9 |D s |
689 | 0 | |5 DE-604 | |
710 | 2 | |a Fraunhofer-Institut für Techno- und Wirtschaftsmathematik |0 (DE-588)10126139-1 |4 isb | |
710 | 2 | |a Fraunhofer IRB-Verlag |0 (DE-588)4786605-6 |4 pbl | |
856 | 4 | 2 | |m X:MVB |q text/html |u http://deposit.dnb.de/cgi-bin/dokserv?id=ba6f57f463454af88bbd5791c954a581&prov=M&dok_var=1&dok_ext=htm |3 Inhaltstext |
856 | 4 | 2 | |m B:DE-101 |q application/pdf |u https://d-nb.info/1227947135/04 |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032737289&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-032737289 |
Record in the search index
_version_ | 1819346155671126016 |
---|---|
adam_text | CONTENTS
Introduction ... 13
1 Optimal Control Problems and Corresponding Solution Methods ... 19
1.1 Introduction Based on Functional Analysis ... 19
1.2 Optimal Control Problems ... 21
1.3 Solution Strategies ... 23
2 Reinforcement Learning ... 27
2.1 The Markov Decision Process ... 27
2.2 Basic Concepts and Technique Classes ... 32
2.2.1 The Working Principle ... 34
2.2.2 Value Function Based Reinforcement Learning ... 35
2.2.3 Policy Based Reinforcement Learning ... 38
2.2.4 Actor-Critic Reinforcement Learning ... 44
2.2.5 Model-Based Reinforcement Learning ... 45
2.3 Comparison of Optimization Concepts ... 46
2.4 Deep Reinforcement Learning ... 48
2.4.1 The Neural Network ... 48
2.4.2 The Value Function ... 52
2.4.3 The Policy ... 54
3 The Differential Equation Based Framework ... 59
3.1 Derivation of the Infinite Dimensional Optimization Problem ... 59
3.2 The Necessary Optimality Conditions ... 65
3.2.1 Shooting Method for the Optimality Conditions ... 73
3.2.2 Gradient Method in Function Space ... 76
3.3 Reinforcement Learning Technique Based on Necessary Optimality Conditions ... 77
3.3.1 The Model-Free Algorithm ... 77
3.3.2 Comparison to REINFORCE ... 79
3.3.3 Free Terminal Pseudo Time ... 82
3.4 Flexible Network Structure by Adaptive Step Sizes ... 84
3.5 Finite State Space ... 86
3.6 En- and Decoder ... 88
4 Applications ... 91
4.1 The Moving Point in the 2D Plane ... 92
4.2 Robot ... 96
4.3 Human Arm ... 107
Conclusion & Outlook ... 116
List of Abbreviations ... 117
|
any_adam_object | 1 |
author | Gottschalk, Simon 1990- |
author_GND | (DE-588)1231111712 |
author_facet | Gottschalk, Simon 1990- |
author_role | aut |
author_sort | Gottschalk, Simon 1990- |
author_variant | s g sg |
building | Verbundindex |
bvnumber | BV047334726 |
ctrlnum | (OCoLC)1240359531 (DE-599)DNB1227947135 |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02699nam a2200649 c 4500</leader><controlfield tag="001">BV047334726</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20210730 </controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">210617s2021 gw a||| m||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">21,N09</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">21,A19</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">21,H06</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">1227947135</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783839616826</subfield><subfield code="c">Broschur : EUR 65.00 (DE), EUR 66.90 (AT), CHF 100.20 (freier Preis)</subfield><subfield code="9">978-3-8396-1682-6</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">3839616824</subfield><subfield code="9">3-8396-1682-4</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">9783839616826</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">Bestellnummer: fhg-twm_47</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1240359531</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB1227947135</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">gw</subfield><subfield 
code="c">XA-DE-BW</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29T</subfield><subfield code="a">DE-634</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">510</subfield><subfield code="2">23sdnb</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">004</subfield><subfield code="2">23sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Gottschalk, Simon</subfield><subfield code="d">1990-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1231111712</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Differential equation based framework for deep reinforcement learning</subfield><subfield code="c">Simon Gottschalk</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Stuttgart</subfield><subfield code="b">Fraunhofer Verlag</subfield><subfield code="c">[2021]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">130 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield><subfield code="c">21 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="b">Dissertation</subfield><subfield code="c">Technische Universität Kaiserslautern</subfield><subfield code="d">2020</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield 
tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Differentialgleichung</subfield><subfield code="0">(DE-588)4012249-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Deep Reinforcement Learning</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Optimal control</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Necessary optimality conditions</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Machine Learning</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Applied Mathematics</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Optimization</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Mathematiker</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Informatiker</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Data Scientists</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Differentialgleichung</subfield><subfield code="0">(DE-588)4012249-9</subfield><subfield 
code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Fraunhofer-Institut für Techno- und Wirtschaftsmathematik</subfield><subfield code="0">(DE-588)10126139-1</subfield><subfield code="4">isb</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Fraunhofer IRB-Verlag</subfield><subfield code="0">(DE-588)4786605-6</subfield><subfield code="4">pbl</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">X:MVB</subfield><subfield code="q">text/html</subfield><subfield code="u">http://deposit.dnb.de/cgi-bin/dokserv?id=ba6f57f463454af88bbd5791c954a581&prov=M&dok_var=1&dok_ext=htm</subfield><subfield code="3">Inhaltstext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">B:DE-101</subfield><subfield code="q">application/pdf</subfield><subfield code="u">https://d-nb.info/1227947135/04</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032737289&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032737289</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV047334726 |
illustrated | Illustrated |
indexdate | 2024-12-20T19:16:36Z |
institution | BVB |
institution_GND | (DE-588)10126139-1 (DE-588)4786605-6 |
isbn | 9783839616826 3839616824 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032737289 |
oclc_num | 1240359531 |
open_access_boolean | |
owner | DE-29T DE-634 |
owner_facet | DE-29T DE-634 |
physical | 130 Seiten Illustrationen, Diagramme 21 cm |
publishDate | 2021 |
publishDateSearch | 2021 |
publishDateSort | 2021 |
publisher | Fraunhofer Verlag |
record_format | marc |
spellingShingle | Gottschalk, Simon 1990- Differential equation based framework for deep reinforcement learning Maschinelles Lernen (DE-588)4193754-5 gnd Datenanalyse (DE-588)4123037-1 gnd Differentialgleichung (DE-588)4012249-9 gnd |
subject_GND | (DE-588)4193754-5 (DE-588)4123037-1 (DE-588)4012249-9 (DE-588)4113937-9 |
title | Differential equation based framework for deep reinforcement learning |
title_auth | Differential equation based framework for deep reinforcement learning |
title_exact_search | Differential equation based framework for deep reinforcement learning |
title_full | Differential equation based framework for deep reinforcement learning Simon Gottschalk |
title_fullStr | Differential equation based framework for deep reinforcement learning Simon Gottschalk |
title_full_unstemmed | Differential equation based framework for deep reinforcement learning Simon Gottschalk |
title_short | Differential equation based framework for deep reinforcement learning |
title_sort | differential equation based framework for deep reinforcement learning |
topic | Maschinelles Lernen (DE-588)4193754-5 gnd Datenanalyse (DE-588)4123037-1 gnd Differentialgleichung (DE-588)4012249-9 gnd |
topic_facet | Maschinelles Lernen Datenanalyse Differentialgleichung Hochschulschrift |
url | http://deposit.dnb.de/cgi-bin/dokserv?id=ba6f57f463454af88bbd5791c954a581&prov=M&dok_var=1&dok_ext=htm https://d-nb.info/1227947135/04 http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032737289&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT gottschalksimon differentialequationbasedframeworkfordeepreinforcementlearning AT fraunhoferinstitutfurtechnoundwirtschaftsmathematik differentialequationbasedframeworkfordeepreinforcementlearning AT fraunhoferirbverlag differentialequationbasedframeworkfordeepreinforcementlearning |