by shigemk2

For the time being, I will only write about technical topics

Enhancement Activities on the Current Upstream Kernel for Mission-Critical Systems #linuxcon

We apply Linux to mission-critical systems.

  • Reliability
  • Availability
  • Serviceability

Do the systems satisfy these requirements in the current upstream kernel?

Activities

1. Fix a deadlock

Memory dump deadlock

When a serious problem induces a panic or oops, we get a memory dump via Kdump.

Kdump is a kernel crash dumping feature based on Kexec

Reason

The cause of the stop is a deadlock on ioapic_lock in NMI context

Fixing

Fixed this problem by initializing ioapic_lock
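
As a rough illustration only (a user-space toy model, not kernel code), the sketch below uses a Python `threading.Lock` as a stand-in for ioapic_lock. The helper name `crash_path_can_take` is hypothetical. It shows why a lock that was held at the moment of the crash can never be taken by the crash path, and why initializing the lock again lets the dump proceed.

```python
import threading


def crash_path_can_take(lock: threading.Lock) -> bool:
    """Return True if the crash path could take the lock without spinning forever."""
    acquired = lock.acquire(blocking=False)  # a real spinlock would spin here instead
    if acquired:
        lock.release()
    return acquired


# Stand-in for ioapic_lock (assumption: modelled as a plain non-reentrant lock).
ioapic_lock = threading.Lock()

# The interrupted code holds the lock when the NMI arrives and never runs again.
ioapic_lock.acquire()
stuck_before_fix = not crash_path_can_take(ioapic_lock)

# Fix in this model: initialize the lock again before the crash path uses it.
ioapic_lock = threading.Lock()
works_after_fix = crash_path_can_take(ioapic_lock)

print(stuck_before_fix, works_after_fix)
```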

Improve data reception latency

Serial RX interrupt frequency - introduction

Serial RX interrupt frequency - problem

A test requirement of one system was that the serial communication time between send and receive had to be within 3 msec

When we measured the time on 16550A, it took 10msec

Serial RX interrupt frequency - reason

HW spec of 16550A

  • 16-byte FIFO buffer
  • FIFO Control Register, FCR (2 bits select the RX trigger level)

In Linux, the trigger level is hard-coded as 8 bytes.
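
To see why the trigger level matters for latency, here is a small back-of-the-envelope sketch. The baud rate (9600) and the 8N1 framing (10 bits per character) are assumptions for illustration; the notes do not state the line settings used in the measurement. The function name `fill_time_ms` is hypothetical.

```python
# RX trigger levels selectable on a 16550A (bytes in the FIFO before an interrupt).
TRIGGER_LEVELS = (1, 4, 8, 14)


def fill_time_ms(trigger_bytes: int, baud: int, bits_per_char: int = 10) -> float:
    """Wire time, in milliseconds, for `trigger_bytes` characters to arrive."""
    return trigger_bytes * bits_per_char * 1000.0 / baud


BAUD = 9600  # assumption: not given in the notes

for level in TRIGGER_LEVELS:
    print(f"trigger={level:2d} bytes -> {fill_time_ms(level, BAUD):.2f} ms before the RX interrupt")
```

Under these assumed settings, waiting for 8 bytes costs about 8.3 ms of wire time alone, which already exceeds a 3 msec budget, while a 1-byte trigger costs about 1 ms.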

Serial RX interrupt frequency - temporary fixing

Changed FCR as a test
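
For reference, the sketch below shows how the 16550A trigger level maps onto the top two bits of the FCR. The constant names mirror those in the kernel's `serial_reg.h`; the helper functions `fcr_for_trigger` and `trigger_from_fcr` are hypothetical and only illustrate the encoding, not the change the speakers made.

```python
UART_FCR_ENABLE_FIFO = 0x01   # bit 0: enable the FIFOs
UART_FCR_TRIGGER_MASK = 0xC0  # bits 7:6: RX trigger level

# 16550A trigger level in bytes -> value of FCR bits 7:6
_TRIGGER_BITS = {1: 0x00, 4: 0x40, 8: 0x80, 14: 0xC0}


def fcr_for_trigger(trigger_bytes: int) -> int:
    """Build an FCR value that enables the FIFO with the given RX trigger level."""
    if trigger_bytes not in _TRIGGER_BITS:
        raise ValueError(f"16550A supports only {sorted(_TRIGGER_BITS)} bytes")
    return UART_FCR_ENABLE_FIFO | _TRIGGER_BITS[trigger_bytes]


def trigger_from_fcr(fcr: int) -> int:
    """Decode the RX trigger level (in bytes) from an FCR value."""
    bits = fcr & UART_FCR_TRIGGER_MASK
    return next(level for level, value in _TRIGGER_BITS.items() if value == bits)


print(hex(fcr_for_trigger(8)), hex(fcr_for_trigger(1)))
```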

Result

The interrupt frequency

Serial RX interrupt frequency - tunable patch

Added new I/F to the serial driver

Serial RX interrupt frequency - current status

This new feature is under discussion.

patch

https://lkml.org/lkml

PID process name table in ftrace - introduction

ftrace is an in-kernel tracer for debugging and analyzing problems

ftrace records PIDs and process names

PID process name table in ftrace - reason

ftrace has a saved_cmdlines file storing the PID-process name mapping table
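
The saved_cmdlines file lists one "PID name" pair per line. As a minimal sketch, the parser below turns such text into a dictionary. The function name `parse_saved_cmdlines` is hypothetical and the sample text is made up for illustration; on a real system the file is typically found under the tracing directory (for example `/sys/kernel/debug/tracing/saved_cmdlines`).

```python
from typing import Dict


def parse_saved_cmdlines(text: str) -> Dict[int, str]:
    """Parse 'PID name' lines into a {pid: name} mapping, skipping malformed lines."""
    table: Dict[int, str] = {}
    for line in text.splitlines():
        pid_str, sep, comm = line.strip().partition(" ")
        if not sep or not pid_str.isdigit():
            continue
        table[int(pid_str)] = comm
    return table


# Made-up sample content, only to show the shape of the data.
sample = "1 systemd\n1045 sshd\n1046 bash\n"
print(parse_saved_cmdlines(sample))
```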

PID process name table in ftrace - current

Store map information in saved_cmdlines

Overwrite this element from 1045_1046
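
The overwrite behaviour can be modelled with a small fixed-size table. This is a simplified sketch, not the kernel's data structure: the class name `SavedCmdlines` and its methods are hypothetical, and the slot-reuse order is an assumption. The default of 128 entries matches the kernel's default table size, and `<...>` is what ftrace prints when it has no name for a PID.

```python
class SavedCmdlines:
    """Toy model of a fixed-size PID -> process name table with overwriting."""

    def __init__(self, size: int = 128):
        self.size = size
        self.slots = [None] * size  # each slot holds (pid, comm) or None
        self.pid_to_slot = {}
        self.next_slot = 0

    def save(self, pid: int, comm: str) -> None:
        slot = self.pid_to_slot.get(pid)
        if slot is None:
            # New PID: take the next slot, evicting whatever was stored there.
            slot = self.next_slot
            self.next_slot = (self.next_slot + 1) % self.size
            old = self.slots[slot]
            if old is not None:
                del self.pid_to_slot[old[0]]
            self.pid_to_slot[pid] = slot
        self.slots[slot] = (pid, comm)

    def lookup(self, pid: int) -> str:
        slot = self.pid_to_slot.get(pid)
        return self.slots[slot][1] if slot is not None else "<...>"


table = SavedCmdlines(size=2)  # tiny size so the overwrite is easy to see
table.save(1045, "proc_a")
table.save(1046, "proc_b")
table.save(1047, "proc_c")     # overwrites the slot that held PID 1045
print(table.lookup(1045), table.lookup(1046), table.lookup(1047))
```

Once the number of traced processes exceeds the table size, earlier PIDs lose their names, which is the situation the talk describes.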

printk fragmentation problem - introduction

printk message outputs error logging or debugging information in kernel

Mixed messages can occur when multiple printk() are executed at the same time

Continuous printk messages with KERN_CONT can also be fragmented

printk fragmentation problem - Solution

How to solve

Store all continuous messages temporarily in a local buffer, and then execute printk

Add the information necessary to merge all fragmented printk messages
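
The first option above, buffering fragments locally and emitting them in one call, can be sketched in user space as follows. The class name `ContinuationBuffer` and the `emit` callback are hypothetical stand-ins; in the kernel the final step would be a single printk.

```python
class ContinuationBuffer:
    """Collect message fragments and emit them as one complete line."""

    def __init__(self, emit):
        self._emit = emit  # stand-in for the single printk call
        self._parts = []

    def cont(self, text: str) -> None:
        self._parts.append(text)
        if text.endswith("\n"):  # a newline closes the line
            self.flush()

    def flush(self) -> None:
        if self._parts:
            self._emit("".join(self._parts))
            self._parts = []


log = []
buf = ContinuationBuffer(log.append)
buf.cont("CPU0: ")
buf.cont("online, ")
buf.cont("ok\n")
print(log)
```

Because nothing reaches the shared log until the line is complete, another context cannot interleave its output in the middle of it.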

printk fragmentation problem - key idea

What kind of information can we use to merge them?

But we don't want to change the dmesg and syslog messages.

printk fragmentation problem - no msg change

Key idea: Use the header information of /dev/kmsg
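
Each /dev/kmsg record starts with a comma-separated header (priority, sequence number, timestamp in microseconds, flag, and optionally further fields), followed by `;` and the message text. The parser below is a minimal sketch of reading that header; the names `KmsgRecord` and `parse_kmsg_record` are hypothetical, and the sample line is only an example of the record shape.

```python
from typing import NamedTuple, Tuple


class KmsgRecord(NamedTuple):
    level: int          # syslog level (low 3 bits of the priority)
    facility: int       # syslog facility (remaining bits)
    seq: int            # sequence number
    timestamp_us: int   # monotonic timestamp in microseconds
    flag: str           # fragment flag, '-' by default
    extra: Tuple[str, ...]  # any additional header fields
    message: str


def parse_kmsg_record(line: str) -> KmsgRecord:
    """Split one /dev/kmsg record into its header fields and message text."""
    header, _, message = line.rstrip("\n").partition(";")
    fields = header.split(",")
    prio = int(fields[0])
    return KmsgRecord(prio & 7, prio >> 3, int(fields[1]), int(fields[2]),
                      fields[3], tuple(fields[4:]), message)


rec = parse_kmsg_record("6,339,5140900,-;NET: Registered protocol family 10\n")
print(rec)
```

Since extra information lives in the header rather than in the message text, tools that only show the text, such as dmesg and syslog, keep their output unchanged.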

printk fragmentation problem - approach

Our approach (1)

  • Add process context information (PID and interrupt context)
  • We can understand the relationship between mixed messages
  • This does not indicate where the message is fragmented

Test: Run 2 kernel threads
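
A minimal sketch of what approach (1) buys us: once every record carries the PID of its writer, interleaved output from two threads can be separated again. The record layout `(pid, text)`, the PIDs, and the function name `group_by_pid` are assumptions for illustration, not the format used in the patch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_pid(records: Iterable[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group message fragments by the PID that wrote them, keeping arrival order."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for pid, text in records:
        groups[pid].append(text)
    return dict(groups)


# Made-up interleaved output from two kernel threads (PIDs 101 and 102).
mixed = [(101, "A1 "), (102, "B1 "), (101, "A2"), (102, "B2")]
print(group_by_pid(mixed))
```

Grouping shows which fragments belong together, but it still cannot tell where one line ends and the next begins.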

Our approach (2)

Current "Fragment Flag"

Three types

  • 'c': first fragment of a line
  • '+': all following fragments
  • '-': no fragment

Change the meaning of "fragment flag"

  • If a message is fragmented, add the "f" flag
  • If a message has "\n", don't add any flag (i.e. '-')
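
Combining the context information with the changed flag, a reader of /dev/kmsg could rebuild complete lines roughly as sketched below. The record layout `(pid, flag, text)` and the function name `merge_fragments` are hypothetical; the sketch assumes that "f" marks a fragment with more text to come and '-' marks the record that completes a line.

```python
from typing import Dict, Iterable, List, Tuple


def merge_fragments(records: Iterable[Tuple[int, str, str]]) -> List[Tuple[int, str]]:
    """Merge per-PID fragments: 'f' means more text follows, '-' completes the line."""
    pending: Dict[int, List[str]] = {}
    lines: List[Tuple[int, str]] = []
    for pid, flag, text in records:
        pending.setdefault(pid, []).append(text)
        if flag != "f":
            lines.append((pid, "".join(pending.pop(pid))))
    return lines


# Made-up interleaved fragments from two kernel threads.
records = [
    (101, "f", "A1 "),
    (102, "f", "B1 "),
    (101, "-", "A2"),
    (102, "-", "B2"),
]
print(merge_fragments(records))
```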

printk fragmentation problem - current status

Current patch

Summary

We are carrying out community activities to realize a Linux that satisfies RAS requirements for mission-critical systems