Our research group works on human cyber-physical systems (CPS) operating across geographically dispersed environments. Examples of such systems include public transportation networks and electrical power networks. We study these societally important systems with a focus on community applications, descriptive and predictive analytics, and the computation platforms and distributed middleware they require. In particular, we are concerned with the following challenges.
Development and Operations Management - Modern operations include over-the-air updates and flexible functionality implemented in software, and the DevOps mechanisms used to design enterprise systems are now being applied to CPS design. The challenge we face is this: how can we ensure that the software engineering principles applied to enterprise systems can be used to guarantee safety and reliability for embedded software that has to operate under the physical constraints imposed by CPS components?
System Reliability - Any system that operates over long periods of time has to cope with degradation caused by ageing, operational stress, and environmental conditions, which can result in failures of the associated physical components. Failures and latent bugs in the software are a further source of degradation and failure, leaving systems inoperable or compromised. It remains a challenge to understand and monitor degradation and failure caused by interactions between the different subsystems of a large CPS that may operate across multiple physical domains.
Robust Computation Platforms - A computation platform must be designed to accommodate and integrate heterogeneous components, operate at multiple time scales (e.g., real time, near real time, and long term), and allow for dynamic resource allocation, while accommodating a variety of topologies, including edge networks, and providing safety, reliability, and security guarantees. Our goal is to implement a strictly layered architecture in which the layers interact only across safe interface sets, provide efficiency guarantees, and ensure that faults originating in one layer propagate across layers in a way that allows better characterization of failure dynamics.