Given the update from Mercedes yesterday, I thought the following patent from Denso was very relevant. It describes an automotive assistant that recognises wake words such as "Hey Mercedes" and "Hey BMW", which suggests BMW may be another brand destined to have Akida embedded. We also know Denso appeared to have a solid relationship with both Brainchip and Toyota at the Edge Computing World 2021 presentation (see the video linked in this thread). Whether Toyota is using Akida for a similar application, time will tell.
As per the Denso patent below, there are multiple methods of authenticating the driver: microphone only; camera AND microphone; and camera, microphone AND mouth-movement reading. If this is what Mercedes are using on their concept car, I'd wager they are going with the most advanced setup to show off the most cutting-edge features on their high-end car. Hence also the description of "1.2 million neurons" rather than just, say, "2 cores". Denso / Mercedes are probably the customer behind the Brainchip driver recognition video.
Pure speculation, DYOR
https://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1&f=G&l=50&d=PG01&p=1&S1=20210105619&OS=20210105619&RS=20210105619

United States Patent Application 20210105619
Kind Code: A1
KASHANI; Ameer; et al.    April 8, 2021

SYSTEM AND METHOD FOR AUTHENTICATING AN OCCUPANT OF A VEHICLE

Abstract

A system in a vehicle includes one or more sensors configured to obtain occupant information from an occupant utilizing at least facial information of the occupant. The system also includes a controller in communication with the one or more sensors. The controller is configured to determine an application policy associated with one or more applications of the vehicle and execute the one or more applications in response to facial information exceeding a first authentication layer or second authentication layer associated with the application policy.

Inventors: KASHANI; Ameer (Southfield, MI); IYER; Gopalakrishnan (Santa Clara, CA)
Applicant: DENSO Corporation, Kariya, JP
Family ID: 74875537
Appl. No.: 16/594389
Filed: October 7, 2019
Current U.S. Class: 1/1
Current CPC Class: G06K 9/00832 20130101; H04W 12/35 20210101; G06K 9/00335 20130101; G10L 15/25 20130101; H04W 4/40 20180201; H04W 12/06 20130101; G06F 21/32 20130101; H04L 63/0861 20130101; H04L 63/102 20130101; G06F 21/629 20130101; H04W 4/38 20180201; H04L 63/20 20130101; H04W 12/60 20210101; H04L 2463/082 20130101
International Class: H04W 12/06 20060101 H04W012/06; H04L 29/06 20060101 H04L029/06; G06F 21/32 20060101 G06F021/32; G06F 21/62 20060101 G06F021/62; H04W 12/00 20060101 H04W012/00; H04W 4/38 20060101 H04W004/38; H04W 4/40 20060101 H04W004/40; G06K 9/00 20060101 G06K009/00; G10L 15/25 20060101 G10L015/25
Claims

1. A system in a vehicle, comprising: one or more sensors configured to obtain occupant information from an occupant utilizing at least facial information of the occupant; and a controller in communication with the one or more sensors, wherein the controller is configured to: determine an application policy associated with one or more applications of the vehicle; and execute the one or more applications in response to facial information exceeding a first authentication layer or second authentication layer associated with the application policy.

2. The system of claim 1, wherein the one or more sensors are configured to obtain occupant information from at least voice information of the occupant in the vehicle.

3. The system of claim 2, wherein the system is configured to identify the occupant utilizing at least the facial information and voice information.

4. The system of claim 3, wherein the facial information includes mouth-movement data associated with the occupant.

5. The system of claim 1, wherein the controller is further configured to prevent execution of the one or more applications in response to the facial information.

6. The system of claim 1, wherein the controller is further configured to deactivate operation of the one or more applications in response to facial information.

7. The system of claim 1, wherein the system is configured to obtain the occupant information from at least the facial information in a reoccurring period of operation of the vehicle.

8. The system of claim 1, wherein the system further includes a wireless transceiver in communication with a mobile device and the controller is further configured to identify the occupant utilizing at least the mobile device.

9. The system of claim 8, wherein the controller is configured to execute the one or more applications in response to the occupant.

10. A system in a vehicle, comprising: one or more sensors configured to obtain occupant information from one or more occupants utilizing at least facial information of the one or more occupants; a wireless transceiver in communication with a mobile device; and a controller in communication with the one or more sensors and the wireless transceiver, wherein the controller is configured to: identify the occupant from at least the facial information and the mobile device; determine an application policy associated with one or more applications, wherein the application policy is associated with at least a first authentication layer and a second authentication layer; and execute the one or more applications in response to facial information exceeding the first authentication layer or the second authentication layer associated with the application policy.

11. The system of claim 10, wherein the controller is configured to obtain the facial information from the one or more occupants in a cyclical manner over a threshold period.

12. The system of claim 10, wherein the controller is further configured to determine the application policy from the one or more applications.

13. The system of claim 10, wherein the controller is further configured to execute the one or more applications in response to voice information of the one or more occupants exceed the first authentication layer or second authentication layer associated with the application policy.

14. The system of claim 10, wherein the controller is configured to obtain facial information from the one or more occupants in a cyclical manner over a threshold period.

15. The system of claim 10, wherein the one or more sensors are configured to obtain occupant information from at least voice recognition data of the one or more occupants in the vehicle.

16. The system of claim 10, wherein the first authentication layer defines one attribute associated with the occupant information, and the second authentication layer defines more than one attribute associated with the occupant information that includes a time stamp.

17. The system of claim 16, wherein the system is configured to deactivate operation of the system in the vehicle in response to voice recognition data of the one or more occupants in the vehicle.

18. The system of claim 16, wherein the system is configured to block access to an application in response to the facial information falling below the first authentication layer or the second authentication layer.

19. A method in a vehicle, comprising: obtaining facial information and voice information from an occupant utilizing at least a camera and microphone in the vehicle; identifying the occupant utilizing at least the facial information and the voice information; determining an application policy associated with one or more applications in response to the identification of the occupant; and executing the one or more applications in response to facial information and voice information exceeding a first authentication layer or second authentication layer associated with the application policy.

20. The method of claim 19, wherein the method further includes blocking access to the one or more applications in response to the facial information falling below the first authentication layer or the second authentication layer.

Description

TECHNICAL FIELD

[0001] The present disclosure relates to occupant authentication in a vehicle.

BACKGROUND

[0002] Vehicle systems may authenticate occupants in a vehicle to make sure the appropriate occupants are operating the appropriate features in the vehicle. In another example, vehicle systems may be utilized to prevent cyber-attacks carried out on the vehicle. For example, voice recognition systems may be susceptible to "dolphin attacks," which may be an attack hidden by high-frequency sounds that our voice assistants can detect, but the human ear cannot hear.

SUMMARY

[0003] According to one embodiment, a system in a vehicle includes one or more sensors configured to obtain occupant information from an occupant utilizing at least facial information of the occupant. The system also includes a controller in communication with the one or more sensors. The controller is configured to determine an application policy associated with one or more applications of the vehicle and execute the one or more applications in response to facial information exceeding a first authentication layer or second authentication layer associated with the application policy.

[0004] According to a second embodiment, a system in a vehicle includes one or more sensors configured to obtain occupant information from one or more occupants utilizing at least facial information of the occupant, a wireless transceiver in communication with a mobile device, and a controller in communication with the one or more sensors and the wireless transceiver.
The controller is configured to identify the occupant from at least the facial information and the mobile device, determine an application policy associated with one or more applications, wherein the application policy is associated with at least a first authentication layer and a second authentication layer, and execute the one or more applications in response to facial information exceeding the first authentication layer or second authentication layer associated with the application policy.

[0005] According to a third embodiment, a method in a vehicle includes obtaining facial information and voice information from an occupant utilizing at least a camera and microphone in the vehicle, identifying the occupant utilizing at least the facial information and the voice information, determining an application policy associated with one or more applications in response to the identification of the occupant, and executing the one or more applications in response to facial information and voice information exceeding a first authentication layer or second authentication layer associated with the application policy.

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] FIG. 1 illustrates an example block diagram of a vehicle system 100.

[0007] FIG. 2 illustrates an exemplary flowchart of a user authentication system 200 in one embodiment.

[0008] FIG. 3 illustrates an exemplary table 300 that may be utilized to authenticate various commands.

DETAILED DESCRIPTION

[0010] Voice recognition systems are becoming more popular, especially in a vehicle environment. Voice recognition systems and virtual assistants are becoming more personalized and tailored to individual users. Voice recognition systems may automatically recognize individual users by extracting distinctive features related to their acoustic patterns. Voice recognition systems may utilize a "wake word," such as "OK Google," "Hey Siri," or "Alexa" and subsequently process natural language requests following from an individual user. Similarly, automotive voice assistants integrated into in-vehicle infotainment systems may utilize wake words such as "Hey Mercedes" or "Hey BMW". Virtual assistants may process requests from recognized users and provide access to privileged functionalities or authorize critical system operations. For example, after recognizing an enrolled user, a virtual assistant may dynamically respond to the user's speech by modifying, sharing or transmitting personal information, conducting financial transactions, creating new appointments, adjusting vehicle parameters such as speed or destination, or reconfiguring other critical in-vehicle features/services, and so forth. With the growing diffusion of self-driving vehicles and ride-sharing services, passengers are increasingly sharing transportation with unknown parties. Relying on acoustic pattern feature extraction to recognize individual users may be prone to inaccuracies and leave virtual assistants susceptible to cyberattacks such as spoofing. For example, a malicious actor may impersonate the voice of another vehicle occupant by capturing and replaying it directly, or use a dataset of samples captured a priori to train a Generative Adversarial Network (GAN) and produce compelling, arbitrary speech with the victim's acoustic pattern. Researchers have found a way to utilize ultrasonic audio commands that voice recognition systems can hear, but humans cannot, to covertly control virtual assistants. Some cyberattacks may translate a standard human voice command and broadcast the translation into ultrasonic frequencies, sometimes called a "dolphin attack." Such frequencies are not possible for humans to hear, but voice recognition systems may still utilize the ultrasonic commands. It may be beneficial for vehicle systems to combat such attacks.
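(Side note from me, not the patent: Denso's actual answer to this is the multi-sensor authentication described further down, but purely to illustrate what a "dolphin attack" exploits, here is a naive sketch of a signal-level sanity check on incoming audio. The function name, the 96 kHz sampling assumption and the thresholds are all mine, not Denso's.)

```python
import numpy as np

def looks_like_ultrasonic_injection(samples: np.ndarray,
                                    sample_rate: int = 96_000,
                                    speech_cutoff_hz: float = 8_000.0,
                                    ratio_threshold: float = 0.5) -> bool:
    """Flag audio whose energy sits mostly above the normal speech band.

    `samples` is a mono float array; the microphone/ADC is assumed to sample
    fast enough (e.g. 96 kHz here) to observe near-ultrasonic content at all.
    """
    spectrum = np.abs(np.fft.rfft(samples)) ** 2                 # power spectrum
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)   # bin frequencies in Hz
    high_band = spectrum[freqs >= speech_cutoff_hz].sum()        # energy above the speech band
    total = spectrum.sum() + 1e-12                               # avoid divide-by-zero on silence
    return bool(high_band / total >= ratio_threshold)            # suspicious if high band dominates
```

A production system would obviously lean on the cross-checks the patent actually claims (face match, mouth movement, voiceprint) rather than a single spectral ratio like this.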
[0011] Occupant identity may be continuously classified using various techniques, such as facial recognition, acoustic patterns, biometric profiles, behavioral models, association of a driver or user via a key fob or mobile device, association of a driver or user via a shared secret such as a passphrase or PIN code, analysis of vehicle seat settings, body weight information collected from seat sensors, etc. In one embodiment, when an application feature requires a higher level of assurance, it may request occupant state information from an occupant monitoring system (OMS). In response to the request, the OMS may transmit the occupant state information continually to the application. An application may have a requisite occupant state in order to access certain features of the application. The application may also correlate critical features of the application with the occupant state information to determine implausibility and deviations from expected behavior caused by the presence of a fault or cyberattack. The application may be preconfigured with validation instructions as part of a data validation engine.
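(My reading of the "application policy" and the two "authentication layers" in claims 1, 10 and 16, expressed as a minimal sketch: the first layer is satisfied by a single strong attribute, while the second layer needs several attributes plus a fresh time stamp. The class names, fields and thresholds below are hypothetical, not from Denso.)

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class OccupantState:
    """Occupant state as it might be reported by an occupant monitoring system (OMS)."""
    face_score: float                      # 0..1 confidence the face matches the enrolled user
    voice_score: Optional[float] = None    # 0..1 confidence the voiceprint matches, if available
    mouth_moving: Optional[bool] = None    # True if mouth movement accompanied the spoken command
    timestamp: float = field(default_factory=time.time)

@dataclass
class ApplicationPolicy:
    """Per-application policy with two authentication layers (cf. claims 1, 10, 16)."""
    first_layer_face: float = 0.90         # first layer: one attribute, high bar
    second_layer_face: float = 0.75        # second layer: several attributes ...
    second_layer_voice: float = 0.75
    max_staleness_s: float = 2.0           # ... plus a freshness requirement on the time stamp

def may_execute(state: OccupantState, policy: ApplicationPolicy) -> bool:
    # First layer: one strong attribute is enough.
    if state.face_score >= policy.first_layer_face:
        return True
    # Second layer: multiple weaker attributes, including a fresh time stamp.
    fresh = (time.time() - state.timestamp) <= policy.max_staleness_s
    return (fresh
            and state.voice_score is not None
            and state.face_score >= policy.second_layer_face
            and state.voice_score >= policy.second_layer_voice
            and bool(state.mouth_moving))

# Example: face alone misses the first layer, but face + voice + mouth movement clears the second.
ok = may_execute(OccupantState(face_score=0.80, voice_score=0.92, mouth_moving=True),
                 ApplicationPolicy())
```

I'd guess an application would demand the stricter multi-attribute path for critical features (payments, vehicle parameters) and accept the single-attribute path for routine requests, but that split is my speculation.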
[0013] The controller 101 may be in communication with various sensors, modules, and vehicle systems both within and remote of the vehicle 102. The vehicle system 100 may include various sensors, such as various cameras, a LIDAR sensor, a radar sensor, an ultrasonic sensor, or other sensor for detecting information about the surroundings of the vehicle, including, for example, other vehicles, lane lines, guard rails, objects in the roadway, buildings, pedestrians, etc. In the example shown in FIG. 1, the vehicle system 100 may include a camera 103 and a transceiver 105. The vehicle system 100 may also include a microphone, a global positioning system (GPS) module, a human-machine interface (HMI) display (not shown), as well as other sensors, controllers, and modules. FIG. 1 is an example system and the vehicle system 100 may include more or less sensors, and of varying types. The vehicle system 100 may be equipped with additional sensors at different locations within or on the vehicle 102 and/or remote of the vehicle 102, including additional sensors of the same or different type. As described below, such sensors may collect sensor data 106. The sensor data 106 may include any data collected by various sensors. Sensor data 106 may include image data, GPS data, vehicle speed data, vehicle acceleration data, voice recognition data, facial recognition data, biometric data, or any other data collected by various sensors or processors in the vehicle 102. Sensor fusion may be applied to the sensor data 106 to aggregate information, observe user interactions, build operational contexts, determine occupant state, identification of the occupant, and other items such as vehicle system usage patterns, objects handled by occupants, and so forth. Sensor fusion may occur in response to software that combines data from several sensors to improve an application or performance of a system or subsystem. Combining data from multiple sensors may correct for the deficiencies of the data collected by a specific type of individual sensor. Thus, sensor fusion may allow calculation of more accurate information. For example, if facial recognition data is utilized alone, it may have difficulty identifying an individual apart from another individual (e.g., one twin from another twin, etc.). By adding voice recognition data, the vehicle system 100 may have a higher probability of correctly identifying the individual from another individual.
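(Again just my own toy illustration of why the sensor fusion point above matters: if you treat the face match and the voiceprint match as roughly independent evidence and combine them naive-Bayes style, the identical-twin example in that paragraph resolves itself. The numbers and threshold are made up.)

```python
def fused_identity_score(face_likelihood: float, voice_likelihood: float) -> float:
    """Naive fusion of two roughly independent match likelihoods for the claimed identity."""
    return face_likelihood * voice_likelihood

# Identical-twin scenario: the face alone is ambiguous (~0.55 for either twin),
# but the voiceprint separates them, so only the enrolled twin clears a 0.4 threshold.
enrolled_twin = fused_identity_score(0.55, 0.90)   # 0.495 -> accepted
other_twin    = fused_identity_score(0.55, 0.20)   # 0.110 -> rejected
```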
[0017] The camera 103 may be mounted in the vehicle 102 to monitor occupants (e.g., a driver or passenger) within a passenger cabin of the vehicle 102. The camera 103 may be part of an occupant monitoring system (OMS) 104. The camera 103 may be utilized to capture images of an occupant in the vehicle 102. The camera 103 may obtain facial information of an occupant, such as eye-movement, mouth-movement, and head-movement, as discussed further below. The camera 103 may be, for example, a color camera, infrared camera, radar/ultrasonic imaging camera, or time of flight camera. The camera 103 may be mounted on a head rest, on the dashboard, in the headliner, or in any other suitable location. Additionally or alternatively, the camera 103 may be located on a mobile device (e.g., tablet or mobile phone) and may capture the occupant's (e.g., driver or passenger) face, torso, limbs, eyes, mouth, etc.
[0018] The controller 101 may receive occupant information from the OMS 104 to determine an abnormal situation within the vehicle 102. The OMS 104 may employ one or more activity sensors such as a driver-facing camera, a passenger-facing camera, a health scanner, and an instrument panel to monitor activities performed by the occupants (e.g., driver or passenger). Based on the activity sensors, the OMS 104 may determine whether the driver is, for example, distracted, sick, or drowsy as the abnormal situation. For example, a passenger-facing camera may be employed in a vehicle headliner, vehicle headrest, or other area of the vehicle 102 to monitor activity of the passenger. The OMS 104 may also employ a microphone that is in communication with a voice recognition (VR) engine that can capture voice information of an occupant. The voice information may be utilized for voice commands in a voice recognition session. Based on the various sensors, the OMS 104 may determine whether the occupant (e.g., driver or passenger) is, for example, fussy, experiencing motion sickness, hungry, feverish, etc.
[0019] In another example, the OMS 104 may include a health scanner mounted on a seat of the vehicle 102, to a child seat, or another suitable location which the occupant touches or is positioned in the line of sight thereof. The health scanner may scan the occupant's heartbeat, blood pressure, pulse, or other health related information. The OMS 104 processes data received from the health scanner and monitors whether the occupant is suffering from a severe physical condition or episode. The OMS 104 may also be utilized with the health scanner to determine if various fluctuations in data may identify stress or issues with the occupant.
[0020] The vehicle system 100 may also include one or more external cameras located on the vehicle 102. The external camera may be mounted to the rear-view mirror, side-view mirrors, doors, fenders, roof/pillars, or bumpers either independently or in conjunction with another external vehicle component such as illumination devices, ornamental objects, or handles, etc. The external camera may also be facing out of the vehicle cabin through a vehicle's windshield to collect imagery data of the environment in front of the vehicle 102. The external camera may be utilized to collect information and data regarding the front of the vehicle 102 and for monitoring the conditions ahead of the vehicle 102. The camera may also be used for imaging the conditions ahead of the vehicle 102 and correctly detecting the position of lane markers as viewed from the position of the camera and the presence/absence, for example, of lighting of the headlights of oncoming vehicles. For example, the external camera may be utilized to generate image data related to vehicles surrounding the vehicle 102, lane markings ahead, and other object detection. The vehicle 102 may also be equipped with a rear camera (not shown) for similar circumstances, such as monitoring the vehicle's environment around the rear proximity of the vehicle 102. When equipped with more than one external camera, the vehicle 102 may combine individual fields of view to provide a collective field of view and may also stream the imagery in real-time to local or remote consumers. In another example, the OMS 104 may share information by sending messages directly to an application module 110 or indirectly by populating a local/remote database connected to the application module 110. The shared information may include time-indexed imagery data (e.g., including a time stamp) along with specific data corresponding to detected events/conditions, such as the occupant events previously described.
[0021] In another embodiment, the vehicle system 100 may be equipped with a sound identification device (e.g., microphone). The microphone may determine a probability that the sound data corresponds to a pre-defined sound or sound model based on a subset of temporal parameters. For example, the microphone may apply an algorithm (e.g., trained deep-neural-network) to determine if an occupant event has occurred. The algorithm may take a number of inputs corresponding to the number of temporal parameters. Each acoustic feature vector may include a number of features and temporal parameters that are determined for each acoustic feature. Of course, in other embodiments, the number of parameters may vary. The deep-neural-network algorithm of the illustrative microphone may have previously been trained using machine learning in order to accurately determine if the sound data matches a pre-defined sound. The deep-neural-network algorithm may employ a softmax layer, backpropagation, cross-entropy optimization, and reinforcement learning as part of the training. This training may include supplying samples of sounds that match the pre-defined sound and samples of sounds that do not match the pre-defined sound, such as sounds similar to expected background noise. For example, if the pre-defined sound is an infant crying, the algorithm may be provided with a number of samples of infants crying as well as sounds similar to expected background noise such as adult conversation, road traffic noise, and other vehicle sounds. In some embodiments, the microphone may determine whether the sound corresponds to several different pre-defined sounds, such as a "wake word," such as "OK Google," "Hey Siri," or "Alexa". In other embodiments, the microphone may perform automated speech recognition (ASR) to transcribe occupant (e.g., driver or passenger) commands or conversations for consumption by vehicle services (e.g., services provided by an application module 110). The microphone may also allow users to register their "voiceprint" and perform automated recognition of such users by correlating acoustic features learned by the aforementioned algorithm.
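(For anyone curious what a "trained deep-neural-network" with a softmax layer over acoustic feature vectors boils down to, here is a minimal PyTorch sketch. The patent gives no architecture; the layer sizes, the 40-dimensional feature vector and the class list below are my assumptions, purely for illustration.)

```python
import torch
import torch.nn as nn

class SoundEventClassifier(nn.Module):
    """Tiny illustrative classifier over pre-defined in-cabin sounds.

    The patent only says "trained deep-neural-network" with a softmax output;
    the hidden size, 40-dim acoustic feature vector and class list are assumed.
    """
    CLASSES = ["wake_word", "infant_crying", "background_noise"]

    def __init__(self, n_features: int = 40, n_hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, len(self.CLASSES)),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Softmax turns the raw logits into per-class probabilities.
        return torch.softmax(self.net(features), dim=-1)

# One 40-dimensional acoustic feature vector (e.g. log-mel energies) per audio frame.
probs = SoundEventClassifier()(torch.randn(1, 40))
is_wake_word = probs[0, 0].item() > 0.8   # threshold on the "wake_word" class probability
```

Training such a network with cross-entropy and backpropagation on matched and non-matched sound samples, as the paragraph describes, is standard practice; the interesting part for this thread is that inference over a model this small is exactly the kind of always-on, low-power workload a neuromorphic part like Akida is pitched at.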