Fall Detection

Zhong Zhang and Vassilis Athitsos



The goal of a fall detection system is to automatically detect cases where a person falls and may have been injured. A natural application of such a system is home monitoring of patients and elderly persons, so that relatives and/or authorities can be alerted automatically when a fall causes an injury. We propose a statistical method based on Kinect that makes a decision from the last few frames by considering the number of frames during which the person has been falling, the magnitude of the fall, the maximum velocity of the fall, and the frame-by-frame rate of decrease during the fall. Because the range of the depth sensor is 0.5 m to 4 m, a single Kinect is not enough to cover the whole space, so we set up two Kinects in our home environment. Our user-independent and camera-independent tests show that the method is applicable in real life.
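The four quantities named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical per-frame height signal in meters (for example, the height of the tracked person above the floor), a frame rate of 30 frames per second, and that "falling" means the longest run of consecutive frames in which the height decreases. The names fall_features and FallFeatures are made up for this example, and the mean drop per frame is only one possible reading of the frame-by-frame rate of decrease.

```python
from typing import List, NamedTuple


class FallFeatures(NamedTuple):
    falling_frames: int         # number of consecutive decreasing steps
    magnitude: float            # total height lost over the run, in meters
    max_velocity: float         # largest single-step drop, in meters per second
    mean_drop_per_frame: float  # average height lost per frame, in meters


def fall_features(heights: List[float], fps: float = 30.0) -> FallFeatures:
    """Summarize the longest run of strictly decreasing heights.

    `heights` is a hypothetical per-frame height signal (assumption);
    `fps` is the assumed frame rate of the sensor.
    """
    best_start, best_end = 0, 0  # indices bounding the longest decreasing run
    run_start = 0
    for i in range(1, len(heights)):
        if heights[i] < heights[i - 1]:
            # Still falling: extend the current run and keep it if it is the longest.
            if i - run_start > best_end - best_start:
                best_start, best_end = run_start, i
        else:
            # Height did not decrease: a new candidate run starts here.
            run_start = i

    steps = best_end - best_start
    if steps == 0:
        return FallFeatures(0, 0.0, 0.0, 0.0)

    drops = [heights[i - 1] - heights[i] for i in range(best_start + 1, best_end + 1)]
    magnitude = heights[best_start] - heights[best_end]
    return FallFeatures(steps, magnitude, max(drops) * fps, magnitude / steps)


if __name__ == "__main__":
    # Made-up heights for illustration only, not data from the experiment.
    print(fall_features([1.60, 1.60, 1.55, 1.45, 1.30, 1.30]))
```

A decision rule would then compare these values against thresholds or a statistical model learned from labeled data; the sketch deliberately stops at feature extraction because the source does not specify that step.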

Experimental Scenario

The experimental data for this paper were collected in the Heracleia Human Centered Computing Laboratory at the University of Texas at Arlington, where a simulated apartment has been set up. Two Kinects were installed at two corners of the apartment to monitor it. Two Kinects are needed because the range of the depth sensor is 0.5 m to 4 m, which means that one Kinect is not enough to cover the whole apartment. In the following figure, the first row shows view 1 and the second row shows view 2; the left side is the depth map and the right side is the color map.

Experimental Data

Six subjects performed several actions in each of the two scenes. The data set can be downloaded from the following links.

Alexis view1

Alexis view2

Pat view1

Pat view2

Rommel view1

Rommel view2

Soheil view1

Soheil view2

Weihua view1

Weihua view2

Zhong view1

Zhong view2

To read the depth maps, please use the following MATLAB file.

Read depth map file

These actions include real falls and other fall-like actions, such as picking up a coin from the floor, sitting down on the floor, and tying shoelaces. Scene 1 contains 10400 frames and 12 real falls, while scene 2 contains 21214 frames and 14 real falls. The following table shows the fall-like actions in our experiment.

















In the above table, pf means picking up something from the floor, ts means tying shoelaces, sb means lying down on the bed, sif means sitting on the floor, pd means opening the lower drawer, which is very close to the floor, jb means jumping onto the floor, and sf means lying down on the floor.

The following figures show a fall process.


A fall process

We also annotated the start and end frames of every fall.

Start and end frame file.

In the annotation file, each line has the following format: Alexis view1 202 215.
Here, Alexis is the user name, view1 means scene 1, 202 is the start frame, and 215 is the end frame.
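The annotation format described above can be read with a few lines of Python. The following is a minimal sketch, assuming that fields are separated by whitespace, that each fall occupies one line, and that the end frame is inclusive. The names FallAnnotation, parse_annotation_line, parse_annotations and is_fall_frame are hypothetical helpers introduced for this example and are not part of the released files.

```python
from typing import List, NamedTuple


class FallAnnotation(NamedTuple):
    subject: str      # user name, e.g. "Alexis"
    view: int         # scene number taken from "view1" or "view2"
    start_frame: int  # first frame of the fall
    end_frame: int    # last frame of the fall (assumed inclusive)


def parse_annotation_line(line: str) -> FallAnnotation:
    """Parse one line such as 'Alexis view1 202 215'."""
    subject, view, start, end = line.split()
    if not view.startswith("view"):
        raise ValueError(f"unexpected view field: {view!r}")
    return FallAnnotation(subject, int(view[len("view"):]), int(start), int(end))


def parse_annotations(text: str) -> List[FallAnnotation]:
    """Parse every non-empty line of an annotation file's contents."""
    return [parse_annotation_line(line) for line in text.splitlines() if line.strip()]


def is_fall_frame(annotations: List[FallAnnotation],
                  subject: str, view: int, frame: int) -> bool:
    """Return True if `frame` lies inside an annotated fall for this subject and view."""
    return any(
        a.subject == subject and a.view == view and a.start_frame <= frame <= a.end_frame
        for a in annotations
    )


if __name__ == "__main__":
    # The example line quoted in the text above.
    annotations = parse_annotations("Alexis view1 202 215\n")
    print(annotations[0])
    print(is_fall_frame(annotations, "Alexis", 1, 210))
```

Per-frame labels built this way can serve as ground truth when evaluating a detector, with every frame outside the annotated intervals treated as a non-fall frame.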


The following figures show a typical fall-like action, which is sitting on the floor.

Sitting on the floor