Emil Vassev on Formal Verification


Dr. Emil Vassev received his M.Sc. in Computer Science (2005) and his Ph.D. in Computer Science (2008) from Concordia University, Montreal, Canada. Currently, he is a research fellow at Lero (the Irish Software Engineering Research Centre) at the University of Limerick, Ireland, where he is leading Lero's participation in the ASCENS FP7 project and Lero's joint project with ESA on Autonomous Software Systems Development Approaches. His research focuses on knowledge representation and awareness for self-adaptive systems. Apart from his main research, Dr. Vassev's research interests include engineering autonomic systems, distributed computing, formal methods, cyber-physical systems, and software engineering. He has published two books and over 100 internationally peer-reviewed papers. As part of his collaboration with NASA, Vassev has been awarded one patent, with another one pending.

Luke Muehlhauser: In "Swarm Technology at NASA: Building Resilient Systems," you and your co-authors write that:

To increase the survivability of [remote exploration] missions, NASA [uses] principles and techniques that help such systems become more resilient…

…Practice has shown that traditional development methods can’t guarantee software reliability and prevent software failures. Moreover, software developed using formal methods tends to be more reliable.

When talking to AI scientists, I notice that there seem to be at least two “cultures” with regard to system safety. One culture emphasizes the limitations of systems that are amenable to (e.g.) formal methods, and advises that developers use traditional AI software development methods to build a functional system, and try to make it safe near the end of the process. The other culture tends to think that getting strong safety guarantees is generally only possible when a system is designed “from the ground up” with safety in mind. Most machine learning people I speak to seem to belong to the former culture, whereas e.g. Kathleen Fisher and other people working on safety-critical systems seem to belong to the latter culture.

Do you perceive these two cultures within AI? If so, does the second sentence I quoted from your paper above imply that you generally belong to the second culture?


Emil Vassev: Before answering your question, I need to clarify some of the basics of AI as I perceive it. Basically, AI depends on our ability to efficiently transfer knowledge to software-intensive systems. A computerized machine can be considered as one exhibiting AI when it has the basic capability to transform data into context-relevant information, and then that information into conclusions exhibiting knowledge. Going further, I can say that AI is only possible in the presence of artificial awareness, through which we can transfer knowledge to machines. Artificial awareness entails much more than computerized knowledge, however. It must also incorporate means by which a computerized machine can perceive events and gather data about its external and internal worlds. Therefore, to exhibit awareness, intelligent systems must sense and analyze both their components and the environment in which they operate. Determining the state of each component and its status relative to performance standards, or service-level objectives, is therefore vital for an aware system. Such systems should be able to notice changes, understand their implications, and apply both pattern analysis and pattern recognition to determine normal and abnormal states. In other words, awareness is conceptually a product of representing, processing, and monitoring knowledge. Therefore, AI requires knowledge representation, which can be considered a formal specification of the “brain” of an AI system. Moreover, to allow for learning, we must consider an open-world model of this “machine brain”.

Now, let’s go back to your question. I think both research “cultures” have their niche within AI. Both cultures lean towards open-world modeling of AI using formal methods. The difference lies mainly in the importance of the safety requirements, which justifies both approaches. Note that AI is a sort of superior control mechanism that relies exclusively on the functionality of the system to both detect safety hazards and pursue safety procedures. Therefore, in all cases AI is limited by system functionality; systems designed “from the ground up with safety in mind” are presumably designed with explicit safety-related functionality, and thus their AI is less limited when it comes to safety.

To answer your second question: yes, I consider myself as leaning towards the second “research culture”. For many NASA/ESA systems, safety is an especially important source of requirements. Requirements engineers can express safety requirements as a set of features and procedures that ensure predictable system performance under normal and abnormal conditions. Furthermore, AI engineers might rely on safety requirements to derive special self-* objectives controlling the consequences of unplanned events or accidents. Think about the self-* objectives as AI objectives driving the system in critical situations by employing self-adaptive behavior. Safety standards might be a good source of safety requirements and, consequently, of safety-related self-* objectives. Such self-* objectives may provide for fault-tolerant behavior, bounding failure probability, and adhering to proven practices and standards. Explicit safety requirements provide a key way to maintain, within a machine brain, knowledge of what is important for safety. In typical practice, safety-related AI requirements can be derived by a four-stage process (sketched in code after the list):

  1. Hazard identification – all the hazards exhibited by the system are identified. A hazard might be regarded as a condition (situation, event, etc.) that may lead to an accident.
  2. Hazard analysis – possible causes of the system’s hazards are explored and recorded. Essentially, this step identifies all processes, combinations of events, and sequences that can lead from a ‘normal’ or ‘safe’ state to an accident. Success in this step means that we now understand how the system can get to an accident.
  3. Identifying safety capabilities – a key step is to identify the capabilities (functionality) the system needs in order to perform its goals and remain safe. It is very likely that some of these capabilities have already been identified for the purposes of other self-* objectives.
  4. Requirements derivation – once the set of hazards is known and their causation is understood, engineers can derive safety requirements that either prevent the hazards from occurring or mitigate the resulting accidents via self-* objectives.
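As a minimal illustration of this four-stage process (all class and function names below are hypothetical, not from an actual NASA/ESA toolchain), the stages can be pictured as a small data pipeline from identified hazards to derived requirements:

```python
from dataclasses import dataclass, field

# Hypothetical data model; names are illustrative only, not from an
# actual requirements-engineering tool.

@dataclass
class Hazard:
    name: str                                        # stage 1: identification
    causes: list[str] = field(default_factory=list)  # stage 2: analysis
    needed_capabilities: list[str] = field(default_factory=list)  # stage 3

def derive_requirements(hazards: list[Hazard]) -> list[str]:
    """Stage 4: for each understood cause, derive a requirement that
    prevents the hazard or mitigates the resulting accident."""
    requirements = []
    for h in hazards:
        for cause in h.causes:
            requirements.append(
                f"The system shall prevent '{cause}' (hazard: {h.name}) "
                f"using {', '.join(h.needed_capabilities)}.")
        requirements.append(
            f"self-* objective: mitigate accidents resulting from {h.name}.")
    return requirements

# Toy example: one rover hazard traced to two causes.
hazard = Hazard("loss of communication",
                causes=["antenna misalignment", "power brownout"],
                needed_capabilities=["self-diagnosis", "antenna re-pointing"])
for req in derive_requirements([hazard]):
    print(req)
```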

Luke: As explained here, no system can be 100% safe, in part because its formal safety requirements may fail to capture hazards that the system’s designers missed or modeled inadequately. What is your intuitive sense of how much “extra” confidence in a system’s safety is typically gained by applying formal methods? For example, you might say that a space probe launched after extensive testing but without formal methods would have a 70% chance of completing its mission successfully, whereas a space probe that had been tested and formally verified (and, therefore, designed in a different way, so as to be amenable to formal verification) would have a 90% chance of completing its mission successfully.

Naturally, such estimates depend greatly on the details of the specific system, and which properties were proved by the formal verification process, but I’m hoping you can say something about your intuitive sense as to how much safety is gained in exchange for the extra work of designing a system so precisely that it can be formally verified.


Emil: In general, formal methods strive to build software right (and, therefore, reliable) by eliminating defects, e.g., requirements defects. Formal methods tools allow for comprehensive analysis of requirements and design, and eventually approach exhaustive exploration of system behavior, including failure conditions. However, good requirements formalization depends mainly on the analytical skills of the requirements engineers and on the correct use of the formal methods at hand. Therefore, errors can be introduced when capturing or implementing safety requirements. This is probably the main reason why, although effective when backed by the capabilities of dedicated analysis tools such as theorem provers and model checkers, formal methods do not actually eliminate the need for testing.

Regarding safety requirements, the application of formal methods can only add to safety. Even if we assume that proper testing can catch all the safety flaws that we could catch through formal verification, by using formal methods correctly we can always improve the quality of the requirements and end up with more efficient test cases. Moreover, formal methods can be used to create formal specifications that can subsequently be used for automatic test-case generation. Therefore, in exchange for the extra work of formally specifying the system’s safety requirements, you gain not only the possibility of formally verifying and validating those requirements, but also more efficient testing of their implementation.

So, to go back to your question: as you said, 100% safety cannot be guaranteed, but when properly used, formal methods can significantly contribute to safety, not by replacing testing but by complementing it. The quantitative measure of how much safety can be gained with formal methods may be regarded in three aspects:

  1. Formal verification and validation allow for early detection of safety flaws, i.e., before implementation.
  2. High quality of safety requirements improves the design and implementation of these requirements.
  3. Formally specified safety requirements assist in the derivation and generation of efficient test cases.

To be more specific, although it really depends on the complexity of the system in question, my intuition is that these three aspects complement each other, and together they may help us build a system with up to a 99% safety guarantee.


Luke: You wrote about the difficulty of “good requirements formalization” — that is, the challenge of translating our intuitive ideas about how we want the system to behave into a logical calculus.

What types of requirements are researchers currently able to formalize for use in formally verified systems? What types of intuitively desirable requirements are we as-yet unable to figure out how to formalize?


Emil: Contemporary formal verification techniques (e.g., model checking) rely on state-transition models, where objects or entities are specified in terms of the states they can be in, together with the functions performed to change those states or object characteristics. Therefore, basically every system property that can be measured or quantified, or qualified as a function, can be formalized for the needs of formal verification. Usually, the traditional types of requirements – functional and non-functional (e.g., data requirements, quality requirements, time constraints, etc.) – are used to provide a specific description of functions and characteristics that address the general purpose of the system. The formal verification techniques use the formal specification of such requirements to check desired safety and liveness properties. For example, to specify the safety properties of a system, we need to formalize “nothing bad will happen to the system”, which can be done via the formalization of non-desirable system states along with the formalization of behavior that will never lead the system into these states.
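To make this concrete, here is a minimal explicit-state sketch (hypothetical names, a toy system) of checking such a safety property: explore every reachable state and confirm that no non-desirable (“bad”) state is among them:

```python
from collections import deque

def check_safety(initial_states, transitions, is_bad):
    """Explore every reachable state; the safety property holds
    iff no reachable state satisfies is_bad (model checking in
    its simplest, explicit-state form)."""
    seen = set(initial_states)
    frontier = deque(initial_states)
    while frontier:
        state = frontier.popleft()
        if is_bad(state):
            return False, state          # counterexample found
        for nxt in transitions(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True, None                    # "nothing bad" is guaranteed

# Toy example: a counter that must never exceed 3.
ok, witness = check_safety(
    initial_states=[0],
    transitions=lambda s: [s + 1] if s < 3 else [0],  # wraps back at 3
    is_bad=lambda s: s > 3)
print(ok)   # True: the bad states are unreachable
```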

Obviously, the formalization of well-defined properties (e.g., proper states expressed via boundaries, data ranges, outputs, etc.) is a straightforward task. However, it is not easy to formalize uncertainty, e.g., liveness properties (“something good will eventually happen”). Although probabilistic theories such as fuzzy logic help us formalize “degrees of truth” and deal with approximate rather than exact conclusions, the verification tools for fuzzy-control systems are not efficient, due to the enormous state-explosion problem. Moreover, testing such systems is not efficient either, because statistical evidence of their correct behavior may not be sufficient. Therefore, any property that requires gradual assessment (or partial satisfaction, e.g., soft goals) is difficult, and often impossible, to formalize for the purposes of formal verification.

Other properties that are “intuitively desirable” (especially for AI) but still cannot be formalized today are human behavior and principles related to cultural differences, ethics, feelings, etc. The problem is that with today’s formal approaches we cannot express, for example, emotional bias as a meaningful system state.


Luke: Let’s work with a concrete, if hypothetical, example. Suppose a self-driving car’s navigation software was designed with formal verification in mind, at least for many core components of the software. I presume we are very far from being able to define a formal requirement that matches the intuitive meaning of “Don’t ever be in a state such that the car has injured a pedestrian.”

So, what requirements could we formalize such that, if they were proven to be met by the car’s navigation software, they would help us feel increasingly confident that the car would never injure a pedestrian (while still allowing the car to engage in normal driving behavior)?


Emil: Again, this example should be regarded with the insight that “100% safety is not possible”, especially when the system in question (e.g., a self-driving car) engages in interaction with a non-deterministic, open-world environment. What we should do, though, to maximize the guarantee that “the car would never injure a pedestrian”, is to determine all the critical situations involving the car in close proximity to pedestrians. Then we shall formalize these situations as system and environment states, and formalize self-adaptive behavior (e.g., as self-* objectives) driving the car in such situations. For example, a situation could be defined as “all the car’s systems are in operational condition and the car is passing by a school”. To increase safety in this situation, we may formalize a self-adaptive behavior such as “automatically decrease the speed to 20 mph when getting in close proximity to children or a school”.

Further, we need to specify situations involving close proximity to pedestrians (e.g., crossing pedestrians) combined with car states capturing damage or malfunction of the driving system, e.g., flat tires, a malfunctioning steering wheel, malfunctioning brakes, etc. For example, we may specify a self-adaptive behavior such as “automatically turn off the engine when the brake system is malfunctioning and the car is getting in close proximity to pedestrians.”

Other important situations should involve hazards introduced on the road, e.g., snowstorms, ice, or low visibility (formalized as environment states), combined with the danger of the car getting in close proximity to pedestrians. In such situations, the formalized self-adaptive behavior should automatically enforce low speed, turn the lights on, turn the wipers on, etc.
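Purely as an illustration (all situation names, thresholds, and actions below are hypothetical, not from an actual vehicle system), such formalized situation/behavior pairs can be thought of as guarded rules over system and environment states:

```python
# Hypothetical situation -> self-adaptive behavior rules, in the spirit
# of the examples above; a real system would derive these from formally
# specified self-* objectives rather than hard-code them.

RULES = [
    # (guard over system/environment state, prescribed adaptive actions)
    (lambda s: s["near_school"] and s["systems_ok"],
     ["limit_speed(20)"]),
    (lambda s: s["brakes_failed"] and s["pedestrians_nearby"],
     ["engine_off()"]),
    (lambda s: s["visibility"] == "low" or s["road"] in ("ice", "snow"),
     ["limit_speed(15)", "lights_on()", "wipers_on()"]),
]

def adapt(state: dict) -> list[str]:
    """Return every action whose situation guard matches the state."""
    actions = []
    for guard, prescribed in RULES:
        if guard(state):
            actions.extend(prescribed)
    return actions

print(adapt({"near_school": True, "systems_ok": True,
             "brakes_failed": False, "pedestrians_nearby": False,
             "visibility": "good", "road": "dry"}))
# -> ['limit_speed(20)']
```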


Luke: To what degree, if any, can we demonstrate both deductive guarantees and probabilistic guarantees with today’s formal verification methods?

In a sense, many of the deductive proofs for safety properties in today’s formally verified systems are already “probabilistic” in the sense that the designers have some subjective uncertainty as to whether the formal specification accurately captures the intuitively desirable safety properties, and (less likely) whether there was an error in the proof somewhere.

But the question I’m trying to ask is about explicitly probabilistic guarantees within the proof itself. Though perhaps to do this, we’d first need a solution to the problem of how a formal system or logical calculus can have internally consistent probabilistic beliefs about logical sentences (à la Demski 2012 or Hutter et al. 2012).

If we could generalize formal specification and verification procedures to allow both for deductive guarantees and probabilistic guarantees, perhaps we could verify larger, more complex, and more diverse programs.


Emil: With deductive guarantees, a formal verification actually provides true statements demonstrating that the desired safety properties hold. Such a verification process is deterministic, and a complete proof is required to guarantee the correctness of the safety properties. For example, such a proof can be equipped with deterministic rules and expressed in classical first-order logic (or in higher-order logic if we use Isabelle to run a deductive verification). On the other hand, with probabilistic guarantees you accept that a complete proof is not necessary and that safety properties can be verified with some degree of uncertainty. Basically, the probabilistic guarantees can be regarded as the result of quantifying uncertainty in both the verification parameters and the subsequent predictions. With Bayesian methods, for example, we quantify our uncertainty as a prior distribution over the beliefs we have in the values of certain properties. Moreover, we also embed likelihood in the formalization of the properties, i.e., how likely it is that we would observe a certain value under particular conditions. You may think about it as the likelihood that certain safety properties hold in specific situations. Then, the probabilistic guarantees assert in a natural way “likely” properties over the possibilities that we envision.
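A minimal sketch of this Bayesian idea (all numbers hypothetical): treat the probability that a safety property holds in a given class of situations as an unknown parameter with a prior, update it with observed verification or simulation outcomes, and read the “probabilistic guarantee” off the posterior:

```python
# Beta-Bernoulli sketch of a probabilistic guarantee (all numbers are
# hypothetical). The unknown is p = Pr(the safety property holds in a
# randomly encountered situation of a given class).

alpha, beta = 1.0, 1.0          # uniform prior over p

# Likelihood: each simulated/verified run either upholds the property
# (success) or violates it (failure).
successes, failures = 490, 10   # hypothetical observations

alpha += successes              # conjugate posterior update
beta += failures

posterior_mean = alpha / (alpha + beta)
print(f"posterior mean of p: {posterior_mean:.3f}")   # ~0.978

# The "guarantee" is then a statement such as: "the property holds in
# this situation class with posterior mean 0.978," rather than a
# deductive proof that it always holds.
```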

So, to answer your question: unfortunately, deductive guarantees can be provided only for simple safety properties, because the complete proof of more complex ones often, unavoidably, does not terminate. Although deductive verification may deal with infinite-state systems, its automation is limited, mainly due to the decidability of the logical reasoning (first-order logic and its extensions, such as higher-order logic, are not decidable, or rather are semi-decidable). For example, if we go back to our example with the self-driving car, we may supply all the needed deterministic rules expressing our safety requirements (e.g., a speed limit of 20 mph when passing by a school), but the complete proof may eventually not be achievable, because although the desired conclusion follows from some of the premises, other premises may eventually lead to resolution refutation.

The probabilistic guarantees are not as complete as the deductive ones, but they may deal with more complex properties, e.g., where a larger number of states may be required. Of course, this tradeoff should be considered when evaluating the results of any probabilistic formal verification. So, if we go back to your question about how much confidence in a system’s safety is gained with formal methods: probabilistic guarantees bring less confidence than deductive ones, but they may bring some extra confidence for safety properties that cannot be handled otherwise.

It is important to mention that abstraction is the most efficient solution to the state-explosion problem (and, respectively, to the decidability problem of deductive guarantees). With abstraction, the size of the state space is reduced by aggregating state transitions into coarser-grained state transitions. The technique effectively reduces the total number of states to be considered, but it is likely to reduce the granularity of the system model to a point where it no longer adequately represents the system. The problem is that although the abstract model (e.g., the formalization of safety properties) is relatively small, it should also be precise enough to adequately represent the original system.
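A toy sketch of the aggregation step (hypothetical, purely illustrative): map each concrete state to an abstract one, and keep a transition between two abstract states whenever any of their concrete members had one. The abstract model is much smaller, but over-approximates the original behavior, which is exactly the precision tradeoff described above:

```python
def abstract_transitions(concrete_edges, abstraction):
    """Collapse a concrete transition relation into a coarser abstract
    one: (a, b) is an abstract edge iff some concrete edge (s, t) has
    abstraction(s) == a and abstraction(t) == b."""
    return {(abstraction(s), abstraction(t)) for s, t in concrete_edges}

# Toy counter with 100 states, abstracted to just "low"/"high".
edges = {(i, i + 1) for i in range(99)}
abstraction = lambda s: "low" if s < 50 else "high"
print(abstract_transitions(edges, abstraction))
# -> {('low', 'low'), ('low', 'high'), ('high', 'high')}
```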

Therefore, as you said, to obtain better results we shall consider both verification approaches and eventually combine them. For example, we can speculate that deductive and probabilistic guarantees can be obtained in some sort of compositional verification, where we apply the two approaches to different safety properties and eventually combine the results via global safety invariants. Such invariants can be classified as: goal invariants, behavior invariants, interaction invariants, and resource invariants (see my co-authored paper “Verification of Adaptive Systems”).


Luke: Besides mixing probabilistic and deductive verification methods, what other paths forward do you see for improving on our current verification toolset? In particular, what important work in this area seems to be most neglected by funding and researcher hours?


Emil: Well, maybe the most popular technique for formal verification is model checking, where the properties are expressed in a temporal logic and the system formalization is turned into a state machine. Model checking methods verify whether the desired properties hold in all the reachable states of a system, which is basically a proof that the properties will hold during the execution of that system. State explosion is the main issue model checking faces today. This problem gets even bigger for concurrent systems, where the number of states is exponential in the number of concurrent processes. So, basically, model checking is an efficient and powerful verification method, but only when applied to finite, and relatively small, state spaces.
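The exponential blow-up is easy to see in a sketch (hypothetical toy processes): composing n independent two-state processes yields a product state space of 2^n global states:

```python
from itertools import product

def global_states(n, local=("idle", "busy")):
    """Global states of n interleaved two-state processes: one local
    state per process, so 2**n combinations in total."""
    return product(local, repeat=n)

for n in (2, 8, 16):
    count = sum(1 for _ in global_states(n))
    print(f"{n} processes -> {count} global states")  # 4, 256, 65536
```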

Here, to answer your question about what can be done to improve the current verification toolset: on the one side, we need to work on fully automated deductive verification based on decidable logics with both temporal and probabilistic features, and on the other side, we need to work on improving the ability of model checking to handle large state spaces (e.g., symbolic model checking, probabilistic model checking, etc.).

Important work that seems neglected by the scientific community is the so-called stabilization science, which provides a common approach to studying system stability. In this approach, a system is linearized around its operating point to determine a small-signal linearized model at that operating point. The stability of the system is then determined using linear-system stability-analysis methods such as Routh-Hurwitz, Root Locus, Bode Plot, and the Nyquist Criterion. We may use stabilization science to build small-signal linearized models of the different system components, anticipating that the linearized models of system components will yield a relatively small state space, enabling their efficient verification (see again my co-authored paper “Verification of Adaptive Systems”). We can then apply compositional verification techniques to produce an overall, system-wide verification.
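To illustrate the linear-stability step (a sketch with a made-up system matrix; the named analysis methods all answer the same question for linear models), a small-signal model dx/dt = Ax is asymptotically stable exactly when every eigenvalue of A has a negative real part:

```python
import numpy as np

def is_stable(A: np.ndarray) -> bool:
    """Small-signal (linearized) model dx/dt = A @ x is asymptotically
    stable iff all eigenvalues of A lie strictly in the left half-plane,
    the same verdict Routh-Hurwitz or a Nyquist plot would deliver."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

# Hypothetical linearization of one component around its operating point.
A = np.array([[-1.0,  2.0],
              [ 0.0, -3.0]])
print(is_stable(A))   # True: eigenvalues are -1 and -3
```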

Other verification techniques that are not well developed are those related to automatic test-case generation and simulation, which could reduce the cost of testing and improve its quality. For example, test cases can be generated from the formal specification of a system, written in a domain-specific formal language. If combined with code generation and analysis techniques for efficient test cases (e.g., change impact analysis), automatic test-case generation can be used to efficiently test system behavior under simulated conditions (see my co-authored paper “Automated Test Case Generation of Self-Managing Policies for NASA Prototype Missions Developed with ASSL”).
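As a sketch of the idea (a hypothetical toy specification, not ASSL): given a formal state-machine specification, one simple generation strategy is to enumerate bounded execution paths and emit each path as a test case:

```python
# Toy state-machine "specification": states and labeled transitions.
SPEC = {
    "Idle":        [("start", "Monitoring")],
    "Monitoring":  [("anomaly", "SelfHealing"), ("stop", "Idle")],
    "SelfHealing": [("healed", "Monitoring")],
}

def generate_tests(spec, initial="Idle", depth=3):
    """Emit every action sequence of length <= depth as a test case;
    a real generator would also prune by change-impact analysis."""
    tests, stack = [], [(initial, [])]
    while stack:
        state, path = stack.pop()
        if path:
            tests.append(path)
        if len(path) < depth:
            for action, nxt in spec.get(state, []):
                stack.append((nxt, path + [action]))
    return tests

for case in generate_tests(SPEC):
    print(" -> ".join(case))
```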

Moreover, high-performance computing can be used to parallelize simulations, which will allow multiple state-space explorations to occur simultaneously, etc.


Luke: Thanks, Emil!