LinuxCon Attendance 2015

I had been hoping for a long time to take part in LinuxCon, and last year I attended it (plus EmbeddedCon) for the first time. I also took part in a panel in which some of us from Outreachy Round 9 presented and discussed our work during the internship. It was an interesting experience, having to present in that environment, especially as I am usually a pretty shy person. So I really wanted to overcome this and present my work on the project in a clear and comprehensive manner.

I had been working with Josh Triplett on the Linux Kernel Tinification project. I really appreciated having him as a mentor during the internship, and I wish I could have met him at the conference.

I think one of the nicest things about this conference is that the subjects of the presentations are extremely varied, which has two advantages. First, you learn about the communities around different projects, which gives a wider view of the active development work on the Linux kernel (so one can find out about a project, talk to the people around it and start contributing more easily). Second, it is impossible not to find interesting presentations that fit each person’s interests. Of course, the downside is that one cannot attend all of them, which is unfortunate… But, hey!

The two presentations that immediately come to mind from the conference are the one about seccomp and its applications (Michael Kerrisk was indeed a great presenter) and Use “strace” To Understand Your Shell (BASH) (self-explanatory presentation name :)).

I cannot skip mentioning the closing game of the conference (rock-paper-scissors-lizard-Spock), which was really entertaining and fun.

Other than that, Dublin is one of the friendliest cities I have ever visited. 🙂

I am happy that I had this opportunity, thanks to the Linux Foundation, through the Outreachy program. Now I am enthusiastically waiting for the Berlin Embedded Linux Conference!


Building a root-only system

Some embedded systems that run Linux only have to run a small number of specific processes. These kinds of tasks are usually run only by the root user, with full permissions. As controversial as it may seem, adding an option to run a root-only Linux kernel may prove to be a valuable feature for some applications.

The kernel API for retrieving group and user ids is based on these two functions:

static inline uid_t __kuid_val(kuid_t uid)
static inline gid_t __kgid_val(kgid_t gid)

These two functions return the actual uid/gid number from the kuid_t/kgid_t structures. As all the permission checks are done using wrappers over these functions, a sensible idea is to make them always return the root uid/gid (0) in a root-only system. This way a great amount of code would be shed by the compiler’s constant folding.

Because many of the permission checks compare against the 0 uid/gid, the code handling the non-zero case will never be executed, so it can be removed. As the bloat-o-meter script shows, this change removes around 25k from the final kernel image. Considering that a tiny build is around 1000k uncompressed, this apparently trivial change decreases the kernel size by about 2.5%.

The patch implementing this change also removes code that is useful only in multi-user systems, such as uid and gid related syscalls and capabilities. If the community sees value in this change, it should be included in the next release.

Identifying Syscalls (part 2)

The first problem with the approach from the previous post is that objdump is arch specific, so disassembling for ARM, for example, would need a different implementation of objdump. This is why, in order to find all the system calls made in userspace, it is better to use nm, whose output includes all the calls into libc.
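As a rough sketch of the nm step, the snippet below pulls the undefined (imported) symbol names out of nm's text output. It assumes the output comes from something like `nm -D --undefined-only <binary>`, where each undefined symbol is printed as `U name` (optionally with an `@VERSION` suffix). The function name and the sample output are made up for illustration.

```python
def undefined_symbols(nm_output):
    """Return the names of undefined ('U') symbols found in nm text output."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined symbols have no address column, so only two fields remain.
        if len(fields) == 2 and fields[0] == "U":
            # Drop a symbol version suffix such as "@GLIBC_2.2.5" if present.
            names.add(fields[1].split("@")[0])
    return names


# Illustrative sample only, not taken from a real binary.
SAMPLE_NM_OUTPUT = """\
                 U madvise@GLIBC_2.2.5
                 U io_setup
                 U printf@GLIBC_2.2.5
0000000000001139 T main
"""

symbols = undefined_symbols(SAMPLE_NM_OUTPUT)
```

Defined symbols such as `main` are skipped because their lines carry an address as a third field.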

In order to keep a list consisting only of syscalls, we intersect the output of nm with a list obtained from a simple grep in kernel/sys_ni.c, which gives us all the syscalls that can be conditionally compiled. This filters the first list down to something similar to:

[‘uselib’, ‘io_submit’, ‘io_setup’, ‘madvise’] (1)

list of all syscalls from kernel/sys_ni.c (2)

(2) \ ((1) ∩ (2)) => [list of all syscalls that we don’t need to compile in]
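The set expression above can be sketched in a few lines of Python. This is only an illustration: it assumes that the entries in kernel/sys_ni.c look like `cond_syscall(sys_<name>);`, and both the sys_ni.c excerpt and the userspace symbol set are hand-written samples, not real tool output.

```python
import re

# Assumption: conditionally compiled syscalls appear as "cond_syscall(sys_<name>);".
COND_SYSCALL_RE = re.compile(r"cond_syscall\(\s*sys_(\w+)\s*\)")


def optional_syscalls(sys_ni_text):
    """List (2): every syscall named in a cond_syscall() entry."""
    return set(COND_SYSCALL_RE.findall(sys_ni_text))


def unneeded_syscalls(userspace_symbols, optional):
    """Compute (2) \\ ((1) ∩ (2)): optional syscalls the userspace never references."""
    used = userspace_symbols & optional
    return optional - used


# Hand-written excerpt for illustration.
SYS_NI_SAMPLE = """\
cond_syscall(sys_uselib);
cond_syscall(sys_io_setup);
cond_syscall(sys_io_submit);
cond_syscall(sys_madvise);
cond_syscall(sys_fanotify_init);
cond_syscall(sys_kexec_load);
"""

# List (1), plus a non-syscall symbol that the intersection filters out.
userspace = {"uselib", "io_submit", "io_setup", "madvise", "printf"}

unneeded = unneeded_syscalls(userspace, optional_syscalls(SYS_NI_SAMPLE))
```

With these sample inputs, `unneeded` contains the two syscalls that the userspace does not reference.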

Furthermore, we need to match each syscall with the corresponding symbols that compile it out. This is done by parsing all source files and Makefiles in the kernel tree, following these steps:

– use a stack in order to know between which ifdef and endif a syscall is defined;

– keep a dictionary where the key is the syscall and the values are all the symbols that it depends on and the conditionals between them;
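The two steps above could look roughly like the following for a single C source file. This is a simplified sketch, not the actual script: it only tracks `#ifdef`, `#ifndef`, `#if` and `#endif` (ignoring `#else`/`#elif` and Makefile rules), and it assumes syscalls are declared with the `SYSCALL_DEFINEn(name, ...)` macros. The sample source and its `CONFIG_FOO`/`CONFIG_BAR` symbols are hypothetical.

```python
import re
from collections import defaultdict

OPEN_RE = re.compile(r"^\s*#\s*(ifdef|ifndef|if)\b\s*(.*)$")
END_RE = re.compile(r"^\s*#\s*endif\b")
SYSCALL_RE = re.compile(r"^\s*SYSCALL_DEFINE\d\s*\(\s*(\w+)")


def syscall_guards(source):
    """Map each syscall name to the preprocessor conditions enclosing it."""
    stack = []                  # conditions currently open, outermost first
    guards = defaultdict(list)  # syscall name -> list of condition stacks
    for line in source.splitlines():
        opened = OPEN_RE.match(line)
        if opened:
            kind, cond = opened.group(1), opened.group(2).strip()
            # Record "#ifndef X" as the negated condition "!X".
            stack.append("!" + cond if kind == "ifndef" else cond)
            continue
        if END_RE.match(line):
            if stack:
                stack.pop()
            continue
        defined = SYSCALL_RE.match(line)
        if defined:
            guards[defined.group(1)].append(list(stack))
    return dict(guards)


# Hypothetical source fragment for illustration.
SAMPLE_SOURCE = """\
#ifdef CONFIG_FOO
SYSCALL_DEFINE1(foo_call, int, x)
#ifndef CONFIG_BAR
SYSCALL_DEFINE0(bar_call)
#endif
#endif
SYSCALL_DEFINE0(always_call)
"""

guards = syscall_guards(SAMPLE_SOURCE)
```

A syscall with an empty condition list, like `always_call` here, is not guarded by any symbol in this file, so it cannot be compiled out this way.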

Once all of this is done, we can easily combine the results and obtain two simple lists [1]. The output is only a suggestion, rather than automatically setting the given symbols to ‘no’, for two reasons:

– some of those symbols that can be set to ‘no’ (considering syscalls) may compile out some code that is useful for the developer;

– the obtained Kconfig options can have dependencies which need to be solved by hand.

[1] http://pastebin.com/iZ5AcVw2

Identifying Syscalls (part 1)

Briefly, in usual Linux distros, one needs libc in order to be able to link/use the commonly known syscalls (e.g. open,read…). When a userspace program calls, for example, ‘open’, it actually calls a wrapper implemented in libc. We need wrappers as we can’t directly call kernel code.

To make a syscall we must place the syscall number and its arguments in some arch specific registers, then run an instruction that suspends the current process and hands control to the kernel. The kernel sees that it has to handle a syscall, takes the number from the first of these registers and calls the associated routine. After the result (whether the handling was successful or not) is also put into a particular register, it returns to user mode.

Given this, you could write your own syscall wrapper library, only consisting of wrappers that you know your application will use.

What we will do next is try to find which syscalls a given userspace uses by analyzing an object file and identifying the snippets that implement syscalls.

The first trial is made on my own libc, libc.so.6 (for x86_64). We disassemble it (“$ objdump -lD”) and search for, let’s say, renameat:

00000000000574d0 <renameat>:
renameat():

.......

574d9: b8 08 01 00 00 mov $0x108,%eax
574de: 0f 05          syscall

If we look in the syscall table that matches each syscall name with its actual number, we conclude that the number of the syscall is kept in the eax register (in our case, 0x108 = 264). So, the first solution that comes to mind is searching for integer literals that are moved into the %eax register. Although this will identify all syscalls, it will also produce false positives, as the register can be used for other operations.
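A minimal sketch of that search over objdump's text output is shown below. It assumes AT&T syntax as in the listing above. The `require_syscall` switch is an extra idea of mine for trimming false positives (keep a literal only when the very next instruction is `syscall`), not something the approach described here relies on. The last two lines of the sample are made up to show a false positive.

```python
import re

# Matches e.g. "mov $0x108,%eax" in objdump's AT&T-syntax output.
MOV_EAX_RE = re.compile(r"\bmov\s+\$0x([0-9a-fA-F]+),%eax\b")
SYSCALL_INSN_RE = re.compile(r"\bsyscall\b")


def candidate_syscall_numbers(disassembly, require_syscall=False):
    """Collect integer literals moved into %eax, in order of appearance."""
    lines = [l for l in disassembly.splitlines() if l.strip()]
    found = []
    for i, line in enumerate(lines):
        match = MOV_EAX_RE.search(line)
        if not match:
            continue
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        # Optional filter: only accept the literal if a syscall follows directly.
        if require_syscall and not SYSCALL_INSN_RE.search(next_line):
            continue
        found.append(int(match.group(1), 16))
    return found


# First two lines are from the renameat listing; the rest are illustrative.
SAMPLE_DISASSEMBLY = """\
574d9: b8 08 01 00 00 mov $0x108,%eax
574de: 0f 05          syscall
57500: b8 01 00 00 00 mov $0x1,%eax
57505: c3             retq
"""

all_literals = candidate_syscall_numbers(SAMPLE_DISASSEMBLY)
filtered = candidate_syscall_numbers(SAMPLE_DISASSEMBLY, require_syscall=True)
```

Even with the filter on, this stays a heuristic: the number may be loaded several instructions before the `syscall`, or computed at run time.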

If we have a really small, single purpose embedded system, we may want to use only the few syscalls that we are sure our application needs. Then why not compile only the corresponding kernel functions? For this, we need to detect those syscalls and then make it possible to compile the others out of the kernel (for example, by automatically generating a .config that builds a smaller kernel image supporting that userspace). In the next post, we’ll see how it works.

Hello World, s/World/FOSS/

With this blog, I am trying to present the kernel tinification effort: the issues I run into, my way of getting past them, and the resulting contributions. The purpose is to make the kernel fit embedded systems more easily, mostly in terms of size, by including only the core features required by each specific system, while not damaging the overall performance.

Why this subject? I am currently engaged in OPW, working on the Kernel Tinification Project.

There are two main reasons I really like this project: first of all, as things may be modified anywhere in the kernel in order to make it smaller, I can get a glimpse of multiple key kernel functionalities; secondly, the idea that there are still lots of unexploited ways to decrease the kernel size is both tempting and useful. Considering these, I think the hardest part is finding a balance between size, speed and code readability.

I will come back soon with details regarding the approaches I choose in order to contribute to this effort. 🙂