## Key components

- Task/objective ("automated driving to reach destination [here]")
- Resources (state) ("sensors, fuel, etc.")
- Uncertainties ("what in the world is happening")
- Actions ("turn left")

In one line: an agent makes decisions by balancing observation against uncertainty. This is called the observe-act cycle.

See also: connectionism.

## Applications

- Stock shelving
- Automated driving
- Space missions
- Sports
- Congestion modeling
- Online dating
- Traffic light control

## Decision-making methods

- explicit programming: "just code it up". Try this first if you are building something, to establish a baseline: enumerate all the states you can anticipate and hard-code a strategy for each of them.
- supervised learning: manually solve representative states, hard-code strategies for them, and have a model interpolate between them.
- optimization: define an optimization objective tied to a model of the environment, then optimize that objective.
- planning: use a model of the environment directly to predict the best moves.
- reinforcement learning: have the agent interact with the environment directly and optimize its score of success in the environment, without a model.

| Method | Model visible? | Strategy hard-coded? |
| --- | --- | --- |
| explicit programming | yes, all states fully known | yes |
| supervised learning | no, only a sample of it | yes, only a sample of it |
| optimization | no, except the reward | no |
| planning | yes | no |
| reinforcement learning | no | no |

## History

See: decision making history.
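The observe-act cycle can be sketched as a loop: observe the world (with uncertainty), pick an action, act, repeat. A minimal sketch, assuming a toy 1-D "walk to the goal" environment; all names here (`World`, `run_agent`, the noise model) are illustrative, not from any particular library.

```python
import random

class World:
    """Toy 1-D world: the agent starts at 0 and wants to reach position 5."""

    def __init__(self, goal=5):
        self.position = 0
        self.goal = goal

    def observe(self):
        # Noisy observation stands in for the "uncertainties" component:
        # the sensor is occasionally off by one.
        noise = random.choice([-1, 0, 0, 0, 1])
        return self.position + noise

    def act(self, action):
        # The "actions" component: move one step left or right.
        self.position += {"left": -1, "right": 1}[action]

def run_agent(world, max_steps=50):
    for step in range(max_steps):
        obs = world.observe()            # observe (under uncertainty)
        if world.position == world.goal:
            return step                  # objective reached
        action = "right" if obs < world.goal else "left"
        world.act(action)                # act
    return max_steps

random.seed(0)
steps = run_agent(World())
print(steps)
```

Because the observation is noisy, the agent can occasionally step the wrong way near the goal; the loop still converges because most observations are accurate.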
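"Explicit programming" as described above amounts to a hard-coded lookup from enumerated states to actions. A minimal sketch using traffic light control (one of the applications listed); the state and action names are made-up assumptions for illustration.

```python
# Hard-coded policy: every state we guessed in advance maps to an action.
# State/action names ("ns_queue_long", "green_ns", ...) are illustrative.
POLICY = {
    ("ns_queue_long", "ew_queue_short"): "green_ns",
    ("ns_queue_short", "ew_queue_long"): "green_ew",
    ("ns_queue_long", "ew_queue_long"): "alternate",
    ("ns_queue_short", "ew_queue_short"): "alternate",
}

def decide(state):
    # Unknown states fall back to a safe default; needing this fallback
    # is exactly where explicit programming breaks down.
    return POLICY.get(state, "all_red")

print(decide(("ns_queue_long", "ew_queue_short")))  # green_ns
print(decide(("flooded", "ew_queue_short")))        # all_red
```

This is why the table marks explicit programming as "all states fully known": the approach only works when every state the agent can encounter appears in the lookup.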
