IBES - A Tool for Creating Instructions Based on Event Segmentation

IBES is a tool with which users can create multimedia, step-by-step instructions quickly by segmenting a video of a task into steps, adding text, and choosing appropriate pictures and overlays.

Augmented Reality Handbook (Dec. 2012, DFKI)

Online workflow monitoring (Dec. 2012, Leeds)

This video demonstrates the online workflow monitoring results on two of our test datasets, Nails & Screws (first test scenario with simple operations) and Labelling & Packaging (second test scenario with bimanual operations, more complex relations between objects in the workspace, more background clutter and greater variability). The workflow monitoring is based on pairwise spatiotemporal relations between objects in the workspace and the user's hands in a 3D global coordinate frame. Also, spatiotemporal relations between the user's joint configurations are taken into account. The underlying workflow model has been trained in a supervised learning step. The work is performed by the University of Leeds. The recognised objects in the workspace are shown as labels. Object recognition and tracking are based on the work of the University of Bristol.
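As a rough illustration of the idea, the sketch below discretises pairwise 3D distances between tracked entities into qualitative relations per frame; the entity names and thresholds are purely illustrative, not the project's actual vocabulary or parameters.

```python
import math

def pairwise_relation(pos_a, pos_b, near_thresh=0.10, touch_thresh=0.02):
    """Discretise the 3D distance between two tracked entities
    (e.g. a hand and a tool) into a qualitative relation.
    Thresholds in metres are illustrative, not the project's values."""
    d = math.dist(pos_a, pos_b)
    if d < touch_thresh:
        return "touching"
    if d < near_thresh:
        return "near"
    return "apart"

def frame_relations(entities):
    """Compute the relation for every unordered pair in one frame."""
    names = sorted(entities)
    return {
        (a, b): pairwise_relation(entities[a], entities[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# One frame of (invented) workspace positions in a global coordinate frame.
frame = {"hand": (0.0, 0.0, 0.0),
         "screwdriver": (0.05, 0.0, 0.0),
         "box": (0.5, 0.2, 0.0)}
rels = frame_relations(frame)
```

Sequences of such qualitative relation vectors over time are the kind of feature a trained workflow model can classify into primitive events.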

Augmented Reality Display (Dec. 2012, CCG)

This video illustrates some of the augmented reality player's functionalities. In actual use, the user wears a see-through HMD and sees augmented information, such as arrows and regions of interest, registered in the real field of view. In stand-alone mode, the user steps through the workflow using voice commands. For this purpose, the workflow has been segmented into a sequence of primitive events using the workflow recovery functionalities of the COGNITO system. The corresponding overlaid information may be of several types: 3D models indicating locations and illustrating actions and movements, extended audio instructions complementing the textual information, videos illustrating the complete primitive event, or still images clearly showing the tools and objects referred to in the overlaid textual or audio instruction.

Inertial body tracking applied to sport analytics (Oct. 2012, DFKI)

The inertial upper-body motion tracking developed within COGNITO has been extended to a full-body tracking system by adding two sensors on each leg. During the last project period, the system was tested in different scenarios beyond the industrial context. This video demonstrates the tracking capabilities during climbing; a potential application is sports analysis and coaching. Further research will focus on detecting limbs and body parts in the images of the head-mounted camera and using this as additional egocentric information to support the body motion tracking. The camera will also provide translational motion information and enable global positioning.

Monocular markerless hand tracking (Oct. 2012, DFKI)

The 26 DoF hand posture is tracked based on a combination of an efficient database search (template matching) and a local kinematic model-fitting step. The key features of this approach are a novel adaptive search tree database indexing and billboard morphing technique, which enable robust hand tracking from a single RGB camera in complex scenes. After a training phase, the adaptive approach runs at interactive frame rates in pre-seen environments.
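The database lookup can be pictured as a nearest-neighbour search over posture templates. The sketch below uses a plain linear scan and toy three-element features in place of the adaptive search tree and real image descriptors, and omits the local kinematic model-fitting step that would refine the retrieved posture.

```python
import math

# Toy "database" of hand templates: feature vector -> posture label.
# In the real system the database is indexed by an adaptive search tree
# over 26-DoF postures; a linear scan stands in for that lookup here.
TEMPLATES = [
    ([0.9, 0.1, 0.0], "open_hand"),
    ([0.1, 0.9, 0.0], "fist"),
    ([0.4, 0.4, 0.8], "pinch"),
]

def nearest_template(observation):
    """Return the posture label of the closest template (L2 distance).
    A subsequent kinematic model-fitting step would refine this result."""
    return min(TEMPLATES, key=lambda t: math.dist(t[0], observation))[1]

label = nearest_template([0.15, 0.85, 0.05])
```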

Online Ergonomic Assessment in an Industrial Environment (Feb. 2012, UTC/DFKI/SF)

In close cooperation, team members from UTC, DFKI and SmartFactory have developed a system for online global ergonomic evaluation of a worker while performing a task. The system continuously estimates the worker's motions based on a body sensor network and derives global biomechanical scores using the ergonomic tool Rapid Upper Limb Assessment (RULA). Based on this, the user receives visual (through the HMD) and acoustic feedback in real time, allowing the worker to correct his posture immediately and decrease his risk of a musculoskeletal disorder.
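As an illustration of the scoring idea, the sketch below maps an upper-arm flexion angle onto the published RULA upper-arm score bands and turns the result into a binary feedback cue. The adjustment factors (shoulder raised, arm supported, and so on) and the feedback threshold used in the actual system are omitted or assumed.

```python
def upper_arm_score(flexion_deg):
    """Simplified RULA upper-arm score from the flexion angle.
    Follows the published RULA angle ranges; posture adjustments
    are omitted in this sketch."""
    a = abs(flexion_deg)
    if a <= 20:
        return 1
    if a <= 45:
        return 2
    if a <= 90:
        return 3
    return 4

def feedback(score, warn_at=3):
    """Map a score to the kind of real-time cue the worker receives
    (visual/acoustic in the HMD); the threshold is illustrative."""
    return "warn" if score >= warn_at else "ok"

score = upper_arm_score(60.0)  # mid-range flexion
cue = feedback(score)
```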

3D Scene Modelling Using an RGB-D Sensor (Feb. 2012, Bristol)

Given the depth and colour images from an RGB-D sensor, a 3D textured occupancy grid map of the workspace is created. This can be used for tracking the camera relative to the workspace. Moreover, it can be used for segmenting the foreground, thus supporting the detection and tracking of relevant objects and the user's arms and hands.
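The sketch below illustrates the principle with a minimal voxel grid: background points are quantised into occupied cells, and any later point that falls outside those cells is treated as foreground. Cell size and coordinates are illustrative; the real map is a textured occupancy grid built from registered depth and colour images.

```python
def depth_to_grid(points, cell=0.05):
    """Quantise 3D points (workspace coordinates in metres) into the
    set of occupied voxel cells of an occupancy grid."""
    occupied = set()
    for x, y, z in points:
        occupied.add((int(x // cell), int(y // cell), int(z // cell)))
    return occupied

def is_foreground(point, grid, cell=0.05):
    """A point not explained by the background grid is foreground,
    e.g. a hand or tool that entered the workspace after mapping."""
    x, y, z = point
    return (int(x // cell), int(y // cell), int(z // cell)) not in grid

# Two background samples on a (hypothetical) table surface at z = 0.8 m.
background = depth_to_grid([(0.0, 0.0, 0.8), (0.05, 0.0, 0.8)])
```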

Object Detection and Tracking Using an RGB-D Sensor (Feb. 2012, Bristol)

Based on the depth and colour images from an overhead RGB-D sensor, and the 3D occupancy grid map of the workspace, the sensor is tracked and outliers are segmented into foreground points. These foreground points are clustered into connected components based on 3D spatial proximity. Cluster-based tracking maintains trajectories of these connected components using spatial proximity and size similarity. For each unknown cluster, a 2D mask for the cluster is passed for image-based recognition. Objects are recognised from previously learnt hand-held tools and objects. Recognition is based on affine-invariant descriptors of edgelet constellations. The recognition is both fast and scalable.
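The proximity-based clustering step can be sketched as single-linkage grouping over the foreground points: two points join the same component if they lie within a distance threshold of each other. This O(n²) version only illustrates the idea; it is not the project's implementation.

```python
import math
from collections import deque

def cluster_points(points, link_dist=0.05):
    """Group 3D foreground points into connected components via
    breadth-first search under a single-linkage distance threshold."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, component = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            neighbours = [j for j in unvisited
                          if math.dist(points[i], points[j]) <= link_dist]
            for j in neighbours:
                unvisited.discard(j)
            queue.extend(neighbours)
            component.extend(neighbours)
        clusters.append(sorted(component))
    return clusters

# Two well-separated pairs of points -> two clusters.
pts = [(0, 0, 0), (0.03, 0, 0), (1, 1, 0), (1.02, 1, 0)]
groups = cluster_points(pts)
```

Each resulting cluster would then be tracked over time and, if unknown, handed to the image-based recogniser as a 2D mask.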

Editing and Labeling a Workflow (Feb. 2012, CCG)

This video presents the editor for labeling a workflow that has been learnt from example executions by the workflow recovery component. The output of unsupervised learning is a workflow model consisting of clusters that represent primitive events. To edit, annotate and label these primitive events is the main purpose of the editor. The labeling process is supported by images captured during the learning process. It is also possible to add multimedia elements, such as 3D models or illustrations, which will be visualized as augmented reality information during workflow monitoring. Thus, the output of this editor is used by the AR player for information presentation in the HMD.

Augmented Reality Visualization (Feb. 2012, CCG)

This video shows how the augmented reality component presents information to the user. The information is displayed in a see-through HMD by processing instructions received from the workflow monitoring module. Besides sending the information for the current and next action, the workflow module also provides information about task accomplishment, which is visually fed back to the user: success feedback is shown on correct execution, while a blinking alert is shown if the action was not performed correctly.

Inertial Upper Body Tracking under Magnetic Disturbances (Oct. 2011, DFKI)

This video shows a novel egocentric solution for visual-inertial upper-body motion tracking based on recursive filtering and model-based sensor fusion. Visual detections of the wrists in the images of a chest-mounted camera are used as a substitute for the commonly used magnetometer measurements.
In contrast to currently available inertial motion capturing systems, the new method enables successful operation in real industrial environments, which often suffer from severe magnetic disturbances.
Hence, this approach is particularly interesting with regard to a smart user assistance system for industrial manipulation tasks, such as the one developed in the COGNITO project.

First COGNITO User Evaluation (July 2011, SF)

This video shows the user tests conducted in the SmartFactory for the FP7 project COGNITO in June 2011. The aim of the tests was to evaluate the capabilities of the COGNITO concept as well as the acceptance and applicability of head-mounted displays (HMDs) for industrial use. The results of the studies help the COGNITO consortium identify the strengths and weaknesses of the first developments, and they will support the next development stage in the COGNITO project.

Inertial Upper Body Tracker (Feb. 2011, DFKI)

This demonstrator shows how a network of inertial sensors can be used to track the upper body movements of the wearer.

The user of the Upper Body Tracker wears a network of five interconnected miniature inertial measurement units (IMUs). Each IMU comprises micro-electro-mechanical system (MEMS) accelerometers, gyroscopes, and magnetometers contained in an 18-gram casing measuring 3 x 3 x 1.5 cm.

The measurements from the sensor network contain information about the relative motion of the sensors. With IMUs strategically attached to the body, information about the pose and motion of the upper limbs can be obtained. This information is extracted by comparing the measurements to predictions based on a biomechanical model of the body. The model consists of rigid bodies and restricted joints, a simplified but, for this purpose, sufficient description of the human body. The fusion of the measurements and the model is done in an extended Kalman filter (EKF), which produces joint angles and kinematics describing the pose and motion of the monitored parts of the body.
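A minimal sketch of the fusion principle for a single joint angle follows, using a plain linear Kalman filter rather than the full EKF over the biomechanical model: the gyroscope drives the prediction, and an aiding measurement corrects drift. All noise values are illustrative.

```python
def kf_step(angle, p, gyro_rate, meas, dt=0.01, q=1e-4, r=1e-2):
    """One predict/update cycle of a scalar Kalman filter for a joint
    angle. 'meas' stands in for an aiding observation (e.g. a constraint
    from the biomechanical model or, in the visual-inertial variant,
    a wrist detection). q, r are illustrative noise parameters."""
    # Predict: integrate the gyro rate, grow the uncertainty.
    angle += gyro_rate * dt
    p += q
    # Update: blend in the measurement, weighted by the Kalman gain.
    k = p / (p + r)
    angle += k * (meas - angle)
    p *= 1 - k
    return angle, p

# With a stationary gyro and a constant aiding measurement of 1.0 rad,
# the estimate converges to the measurement and the variance shrinks.
angle, p = 0.0, 1.0
for _ in range(50):
    angle, p = kf_step(angle, p, gyro_rate=0.0, meas=1.0)
```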

Musculoskeletal Model of the Hand and Forearm (Feb. 2011, UTC)

Video 1

Figure 1

Video 2

The aim of the UTC-CNRS research unit in the COGNITO project is to develop a musculoskeletal model of the hand and forearm in order to carry out an off-line biomechanical analysis of the data collected with the on-body sensor networks. The model is based on motion capture of real manual movements frequently used in industrial manual tasks (see Video 1).

Kinematic data from the motion capture session are then used to drive a musculoskeletal model composed of 21 segments, 28 muscles and 20 joints providing 24 degrees of freedom (see Figure 1).

Finally, joint loads and muscle-tendon forces are computed using an inverse-to-direct dynamics method during the simulation. An example result is shown in Video 2.
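For a single joint, the inverse-dynamics idea reduces to the familiar rigid-body equation tau = I*alpha + m*g*l_com*cos(theta). The sketch below computes a net elbow torque for a one-segment forearm with illustrative mass, length and inertia values, not the parameters of the 21-segment model.

```python
import math

def elbow_torque(theta_deg, alpha, m=1.5, l_com=0.17, inertia=0.06, g=9.81):
    """Net elbow torque for a single rigid forearm segment via inverse
    dynamics: tau = I*alpha + m*g*l_com*cos(theta), with theta measured
    from the horizontal. Segment parameters are illustrative."""
    theta = math.radians(theta_deg)
    return inertia * alpha + m * g * l_com * math.cos(theta)

# Horizontal forearm held still: torque is purely gravitational.
tau_static = elbow_torque(0.0, 0.0)
```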

Thus, this model will be used not only to document the motion performed by the worker but also to monitor which muscle functions are used and how heavily the joints are loaded. It provides key information for optimizing the ergonomics and safety of industrial tasks by subsequently delivering adequate feedback.

Workflow induction and user guidance (Feb. 2011, Leeds)

In this video we demonstrate how a system can track a worker's actions and show them how to complete the current construction or maintenance task.

By observing the positions of both the hands and tools, and classifying these configurations into recognised actions, a workflow can be formed automatically from repeated examples. From the sequence and nature of the demonstrated component actions, we can build a model that adapts to a new user completing the same task, making a previously implicit workflow sequence explicit. By formalising the demonstrated knowledge of experienced workers, a new user can be trained by the system, while existing users can be monitored to prevent safety-critical deviations from the task at hand.
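A heavily simplified sketch of learning a workflow from repeated examples: a first-order transition model counts which action follows which across demonstrations, and the guidance shown to the user is the most frequently observed successor. The action names are invented for illustration; the project's workflow models are richer than this.

```python
from collections import defaultdict

def learn_workflow(demonstrations):
    """Build a first-order transition model from repeated example
    executions: counts of which action follows which."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in demonstrations:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def next_action(model, current):
    """Most frequently observed successor of the current action."""
    return max(model[current], key=model[current].get)

# Three (hypothetical) demonstrations of the same assembly task.
demos = [["pick_screw", "position", "fasten", "inspect"],
         ["pick_screw", "position", "fasten", "inspect"],
         ["pick_screw", "fasten", "inspect"]]
model = learn_workflow(demos)
```

A deviation check falls out of the same structure: an observed transition with a zero count is a candidate safety-critical deviation.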

Current work at the University of Leeds activity analysis group aims to develop techniques that allow the automatic formation of more advanced workflow models and to increase the detection accuracy of a variety of industrial activities.