In one embodiment, systems and methods are disclosed for evaluating autonomous driving vehicle (ADV) driving decisions. A driving scenario is selected, such as a route, a destination, or a type of driving condition. The ADV planning and control modules are turned off and do not control the ADV. As a human driver drives the ADV, sensors detect and periodically log a plurality of objects external to the ADV. The driving control inputs of the human driver are also logged periodically. An ADV driving decision module generates a driving decision with respect to each object detected by the sensors. The ADV driving decisions are logged but are not used to control the ADV. An ADV driving decision is identified in the logs, and a corresponding human driving decision is extracted from the logs, graded, and compared to the ADV driving decision. The ADV driving decision can then be graded using the logs and the graded human driving decision.
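The extract, grade, and compare steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the record types (`ControlSample`, `PerceptionSample`, `AdvDecision`), the two decision labels `"yield"` and `"overtake"`, the 3-second window, the brake-versus-throttle labeling rule, and the distance-based safety grade are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ControlSample:
    t: float         # log timestamp, seconds
    brake: float     # human brake input, 0..1 (hypothetical scale)
    throttle: float  # human throttle input, 0..1 (hypothetical scale)


@dataclass
class PerceptionSample:
    t: float
    object_id: str
    distance_m: float  # distance from the ADV to the object


@dataclass
class AdvDecision:
    t: float
    object_id: str
    decision: str  # hypothetical labels: "yield" or "overtake"


def extract_human_decision(controls: List[ControlSample], t0: float,
                           window_s: float = 3.0) -> Optional[str]:
    """Label the human decision from control inputs logged in [t0, t0 + window_s]."""
    window = [c for c in controls if t0 <= c.t <= t0 + window_s]
    if not window:
        return None
    mean_brake = sum(c.brake for c in window) / len(window)
    mean_throttle = sum(c.throttle for c in window) / len(window)
    # Assumed rule: more braking than throttle is read as yielding.
    return "yield" if mean_brake > mean_throttle else "overtake"


def grade_safety(perception: List[PerceptionSample], object_id: str, t0: float,
                 window_s: float = 3.0, safe_gap_m: float = 5.0) -> Optional[float]:
    """Grade in 0..1 based on the smallest logged gap to the object in the window."""
    gaps = [p.distance_m for p in perception
            if p.object_id == object_id and t0 <= p.t <= t0 + window_s]
    if not gaps:
        return None
    return min(1.0, min(gaps) / safe_gap_m)


def evaluate(adv: AdvDecision, controls: List[ControlSample],
             perception: List[PerceptionSample]) -> Optional[dict]:
    """Compare one logged ADV decision with the graded human decision."""
    human = extract_human_decision(controls, adv.t)
    grade = grade_safety(perception, adv.object_id, adv.t)
    if human is None or grade is None:
        return None
    agrees = adv.decision == human
    return {
        "adv": adv.decision,
        "human": human,
        "human_grade": grade,
        "agrees": agrees,
        # Assumed rule: an agreeing ADV decision inherits the human grade;
        # a differing one is left ungraded in this sketch.
        "adv_grade": grade if agrees else None,
    }


controls = [ControlSample(10.0, 0.6, 0.0), ControlSample(11.0, 0.6, 0.0)]
perception = [PerceptionSample(10.0, "car1", 8.0), PerceptionSample(11.0, "car1", 4.0)]
result = evaluate(AdvDecision(10.0, "car1", "yield"), controls, perception)
```

A fuller grading scheme would likely score several metrics and would need a way to grade an ADV decision that differs from the human one, for example by projecting its outcome from the logged object trajectories; the sketch leaves that case ungraded.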