Intel Software Careers Website

I was recently featured on Intel’s software careers website:

http://www.intel.com/jobs/careers/software/performance/

http://www.intel.com/jobs/careers/software/

August 13, 2011 · James Kukunas · No Comments
Posted in: Uncategorized

GECCO 2010 Presentation

This morning, Wednesday July 7th, I presented my paper, “A Genetic Algorithm To Improve Linux Kernel Performance on Resource-Constrained Devices”, at the GECCO 2010 conference in Portland, Oregon.

The slides from this presentation can be found here.

July 8, 2010 · James Kukunas · No Comments
Posted in: Uncategorized

x86 Linux Networking System Calls: Socketcall

In a monolithic kernel, such as Linux, networking operations are performed within kernel space. This is clearly seen on architectures such as the DEC Alpha, where system calls, the fabric that connects user space and kernel space, exist for socket operations such as connect, listen, and bind. However, this is not the case with the x86 platform. Instead, on the x86 platform, all socket operations are multiplexed through one system call, socketcall.

Socketcall takes two parameters. The first parameter is an integer that specifies which call to execute. The values and their respective functions are listed in /usr/include/linux/net.h, and are reproduced below:

$ grep SYS_ /usr/include/linux/net.h
#define SYS_SOCKET      1               /* sys_socket(2)                */
#define SYS_BIND        2               /* sys_bind(2)                  */
#define SYS_CONNECT     3               /* sys_connect(2)               */
#define SYS_LISTEN      4               /* sys_listen(2)                */
#define SYS_ACCEPT      5               /* sys_accept(2)                */
#define SYS_GETSOCKNAME 6               /* sys_getsockname(2)           */
#define SYS_GETPEERNAME 7               /* sys_getpeername(2)           */
#define SYS_SOCKETPAIR  8               /* sys_socketpair(2)            */
#define SYS_SEND        9               /* sys_send(2)                  */
#define SYS_RECV        10              /* sys_recv(2)                  */
#define SYS_SENDTO      11              /* sys_sendto(2)                */
#define SYS_RECVFROM    12              /* sys_recvfrom(2)              */
#define SYS_SHUTDOWN    13              /* sys_shutdown(2)              */
#define SYS_SETSOCKOPT  14              /* sys_setsockopt(2)            */
#define SYS_GETSOCKOPT  15              /* sys_getsockopt(2)            */
#define SYS_SENDMSG     16              /* sys_sendmsg(2)               */
#define SYS_RECVMSG     17              /* sys_recvmsg(2)               */
#define SYS_ACCEPT4     18              /* sys_accept4(2)               */

The second parameter is a pointer to an array of parameters for that corresponding call.
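To make the two-parameter shape concrete, here is a minimal C++ sketch of the same multiplexed call made through the C library's syscall(2) wrapper. The names kSysSocket, open_tcp_socket, and check_socket_round_trip are hypothetical, and the sketch assumes a Linux system. On builds where SYS_socketcall is not defined, it falls back to the ordinary socket() function so that it still compiles.

```cpp
#include <cassert>
#include <sys/socket.h>    // AF_INET, SOCK_STREAM, socket()
#include <sys/syscall.h>   // SYS_socketcall, where the platform defines it
#include <unistd.h>        // syscall(), close()

// Hypothetical constant: the value of SYS_SOCKET from the table above.
constexpr int kSysSocket = 1;

// Hypothetical helper: create a TCP socket, returning its fd or -1 on failure.
int open_tcp_socket() {
    // The same three values socket(2) would take, packed into an array.
    unsigned long args[3] = { AF_INET, SOCK_STREAM, 0 };
#ifdef SYS_socketcall
    // First parameter: which call to run. Second: pointer to its arguments.
    return static_cast<int>(syscall(SYS_socketcall, kSysSocket, args));
#else
    // Assumption: this platform has no socketcall, so call socket() directly.
    return socket(static_cast<int>(args[0]), static_cast<int>(args[1]),
                  static_cast<int>(args[2]));
#endif
}

// Usage: the returned descriptor can be closed like any other.
void check_socket_round_trip() {
    int fd = open_tcp_socket();
    assert(fd >= 0);
    assert(close(fd) == 0);
}
```

Going through syscall() rather than inline assembly keeps the sketch short, at the cost of hiding the register setup that the rest of this post walks through by hand.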

For example, to create and then close a TCP socket:

        pushl        %ebp
        movl         %esp, %ebp
        subl         $16, %esp        

        # parameters for socket(2)
        movl         $2, -12(%ebp) # PF_INET
        movl         $1, -8(%ebp)  # SOCK_STREAM
        movl         $0, -4(%ebp) 

        # invoke socketcall
        movl         $102, %eax          #socketcall
        movl         $1, %ebx            #socket
        leal         -12(%ebp), %ecx     #address of parameter array
        int          $0x80
        movl         %eax, -16(%ebp)     #save socket fd

        # close socket
        movl         $6, %eax        # close
        movl         -16(%ebp), %ebx # load socket fd
        int          $0x80

        addl        $16, %esp
        popl        %ebp

Let’s dissect this code.

The first three instructions set up a stack frame and create enough space on the stack to store both the file descriptor of the socket and the array of parameters to pass to the socket call. Because the stack on an x86 machine grows downwards, I subtract from the stack pointer to make room for these local variables. I subtract 16 bytes because the socket call requires an array of 3 unsigned long values, 4 bytes each, so we need 12 bytes to hold the parameter array, plus 4 bytes for the socket's file descriptor, for a total of 16 bytes.
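The 16-byte arithmetic can be checked with a small C++ sketch. The struct name Frame is hypothetical, and fixed-width 32-bit types stand in for the 4-byte int and unsigned long of 32-bit x86, so the check also holds when compiled on a 64-bit machine.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width 32-bit types

// Hypothetical mirror of the 16 bytes reserved by `subl $16, %esp`,
// listed from the lowest address, -16(%ebp), upward.
struct Frame {
    std::int32_t  fd;       // -16(%ebp): socket file descriptor
    std::uint32_t args[3];  // -12(%ebp), -8(%ebp), -4(%ebp): socket(2) arguments
};

static_assert(sizeof(Frame) == 16, "3 * 4 bytes of arguments + 4 bytes of fd");
static_assert(offsetof(Frame, args) == 4, "the array starts 4 bytes above the fd");
```

Both checks run at compile time, so a layout that did not add up to 16 bytes would fail to build.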

The next three instructions set the array with the desired parameters to the socket call. Notice that the values in the array are identical to the values that would be passed to the socket function.

The next four instructions invoke the socketcall system call. On the x86 platform, system calls are performed in one of three ways. In the first method, the system call number is loaded into the EAX register, and the parameters are loaded into the EBX, ECX, EDX, ESI, EDI, and EBP registers respectively. The system call is then invoked using interrupt 128 (int $0x80). The second method uses the SYSCALL instruction: the system call number is loaded into the EAX register, and the parameters are loaded into the EBX, EBP, EDX, ESI, and EDI registers respectively. The second parameter goes in EBP rather than ECX because the SYSCALL instruction itself overwrites ECX with the return address. The third method uses the SYSENTER instruction: the system call number is again loaded into the EAX register and the first five parameters into the EBX, ECX, EDX, ESI, and EDI registers, while EBP holds the user stack pointer and the sixth parameter is read from the stack at that address. Each of these methods is documented in arch/x86/ia32/ia32entry.S of the kernel source.

In this case, I use the first method. First the system call number, 102 for socketcall, is loaded into the EAX register. Then the first parameter, derived from the table above, is moved into the EBX register. The address of the array, containing the parameters to socket(2), is then loaded into the ECX register. The last of the four instructions invokes the system call using interrupt 128. At this point, a socket is created and the file descriptor is returned in the EAX register. The movl instruction after the interrupt saves the file descriptor into the local variable we created earlier on the stack for future use, since the EAX register will be clobbered.

At this point, subsequent socketcall invocations would allow for the socket to be used in various networking functions. Afterwards, it is necessary to close the socket. Since a socket is a file descriptor, it can be closed just like any other file descriptor, using the close system call. Finally, once the socket file descriptor is closed, the stack frame is restored by adding the 16 bytes we subtracted from the stack pointer, effectively deleting the local variables initially created, and then restoring the preserved EBP register.

Hopefully within the next week or so, I will post a full example application that utilizes this interface. It is important to note that, due to the platform-specific nature of this system call, it should not be used in applications that claim to be portable. This article aims to provide insight into the underlying mechanisms, rather than a tutorial on socket programming in Linux.

May 23, 2010 · James Kukunas · 4 Comments
Posted in: Uncategorized

Graduation

On May 16, I graduated from Allegheny College with a Bachelor of Computer Science degree.

As time permits, I will begin to upload information regarding my undergraduate thesis work at Allegheny. The technical report, explaining the motivations, implementation, and findings of my thesis, can be found here.

May 17, 2010 · James Kukunas · No Comments
Posted in: Uncategorized

Honors Convocation

At this year’s Allegheny College Honors Convocation, I was awarded the Allegheny College Student Chapter Prize of the ACM for best senior thesis in the field of Computer Science.

Along with the prestige of this award, I was also gifted The Computer Science Handbook, edited by Allen B. Tucker, whom I had the pleasure of meeting two semesters ago. Three of my professors, Dr Gregory Kapfhammer, Dr Robert Cupper, and Dr Robert Roos, have also written sections in this compilation.

May 5, 2010 · James Kukunas · No Comments
Posted in: Uncategorized

RICSS Talk on Distributed Version Control and Git

I was invited to give a talk this Friday, December 4, 2009, at the Research in Computer Science Seminar hosted by Allegheny College. The talk aims to explain distributed version control systems by exploring git. The announcement poster can be found below. The slides for the presentation and the beamerposter style I created for the poster will be posted here shortly.

RICSS Announcement Poster

EDIT: The slides from the presentation can be found here.

December 2, 2009 · James Kukunas · No Comments
Posted in: Uncategorized

Acer Aspire One Kernel Fedora – Linux DNA

If you are like me, you enjoy squeezing every ounce of performance from your machine, and when working on a constrained platform such as a laptop, each ounce of performance provides crucial benefits to aspects such as battery life and throughput.

Typically, the first thing I do after a fresh Linux install is build a custom kernel. However, when I built the kernel for my Acer Aspire One, I found that the GNU C compiler did not contain as many optimizations for the unique Atom platform as the Intel C Compiler. Information about the Intel compiler and its Atom-specific optimizations is already well documented and can be found here. Thus, it became my goal to compile the Linux kernel with the Intel C Compiler. That is when I stumbled upon the LinuxDNA project.

If you haven’t heard of the LinuxDNA project, they are patching the Linux kernel to build with the Intel toolchain. Using their latest patch for kernel 2.6.30.5 and the Intel Compiler, version 11.1 046, I undertook building a highly optimized Linux kernel for the Acer Aspire One.

To do so, we must first download the vanilla kernel source. The LinuxDNA patch we are using here applies to 2.6.30.5. After downloading, uncompress the kernel:

$ wget ftp://ftp.kernel.org/pub/linux/kernel/v2.6/linux-2.6.30.5.tar.bz2
$ tar -xjvf linux-2.6.30.5.tar.bz2
$ cd linux-2.6.30.5

Next, we must obtain and apply the LinuxDNA patch.

$ wget http://www.linuxdna.com/dna-32bit-icc-2.6.30.patch
$ patch -p1 < dna-32bit-icc-2.6.30.patch

Next, alter the Makefile to add custom compiler flags for the intended architecture. These flags vary depending on your platform. Now, customize and build the kernel as usual.

$ make menuconfig
$ make bzImage ; make modules

Finally, install the kernel and add a grub configuration to boot the new kernel. The grub.conf entry will depend on your kernel and system configuration.

# cp arch/x86/boot/bzImage /boot/linux-2.6.30.5_DNALINUX
# make modules_install
# mkinitrd /boot/init-2.6.30.5_DNALinux `ls /lib/modules/ | grep 2.6.30.5`
# vi /boot/grub/grub.conf

With a fully customized kernel for the Acer Aspire One, built with the Intel C compiler, and a modified rc.sysinit, I achieved the following bootchart. More benchmarks are on the way.

31 second boot

For those with an Acer Aspire One running Fedora, the source and binary RPMs of the kernel I built are on the way. I plan on continuing to make improvements for the Acer Aspire One as time permits. Hopefully, you will find them of value to you. Feel free to let me know of any feature requests.

November 16, 2009 · James Kukunas · No Comments
Posted in: Uncategorized

Git presentation

For my Principles of Software Engineering class, I had to give a 10 minute presentation on the Git revision control system.

The slides from that presentation can be found here.

The tutorial session file, created with the script command, can be found here. The timing file can be found here.

To watch the tutorial, download the timing and session files and use the scriptreplay command:

$ scriptreplay git_tutorial.timing git_tutorial.session

October 3, 2009 · James Kukunas · No Comments
Posted in: Uncategorized

C++0x Lambda Functions

Currently, Microsoft is offering a sneak peek at Visual Studio Professional 2010. While I usually prefer the Intel C++ compiler, this release is interesting because it includes Microsoft’s new C++ compiler, which supports some of the new C++0x features. A complete list of changes in this new Visual Studio release can be found here.

One of the new C++0x features supported is lambda functions and expressions. To get a feel for the new syntax, let’s look at a trivial application.

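#include <iostream>

int main(int argc, char** argv)
{
	// The loop bound is a lambda that is defined and called in place.
	for(int i = 0; i < [=](int x)->int{ return x;}(10); i++) {
		std::cout << i << std::endl;
	}
	std::cin.get(); // wait for a key press before exiting
}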

This trivial application simply outputs x numbers, from 0 to x-1, where x is the parameter to the lambda function inside the loop condition. Let’s examine the new lambda function syntax.

[=](int x)->int{ return x;}(10)

The first part in square brackets is called the capture clause. This tells our lambda function how to access variables in the enclosing scope. The two capture defaults are an ampersand, which means the variables are accessed by reference, and an equal sign, which means the variables are accessed by value. The clause can also be left empty when nothing is captured, or it can name individual variables. In the example provided above, the variables are accessed by value.
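The difference between the two capture modes shows up once the captured variable changes after the lambda is created. The following is a minimal sketch; the function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// What a by-value lambda sees after the original variable changes.
int read_through_value_capture() {
    int n = 1;
    auto f = [=]() { return n; };  // copies n when the lambda is created
    n = 5;                         // the copy held by f is unaffected
    return f();
}

// What a by-reference lambda sees after the original variable changes.
int read_through_reference_capture() {
    int n = 1;
    auto f = [&]() { return n; };  // refers to the enclosing function's n
    n = 5;                         // f observes this update
    return f();
}

// Usage: the value capture still sees 1, the reference capture sees 5.
void check_capture_modes() {
    assert(read_through_value_capture() == 1);
    assert(read_through_reference_capture() == 5);
}
```

Capturing by reference is only safe while the captured variable is still alive, which is why both lambdas here are called before their enclosing function returns.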

The next part in parentheses contains the parameters to the lambda function. In the example provided above, there is only one parameter, which is the integer x.

The part after that specifies the return type. This part is not required if the entire function body consists of a single return statement. In the example provided above, this part is technically not required but is shown for completeness.
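As a rough sketch of when the return type can be dropped and when it is needed, compare the two lambdas below. The function names twice and scale are hypothetical.

```cpp
#include <cassert>

// The body is a single return statement, so the return type (int) is deduced.
int twice(int x) {
    auto f = [](int v) { return v * 2; };
    return f(x);
}

// The body has two return statements of different types (int and double),
// so the return type is spelled out and both results convert to double.
double scale(int x) {
    auto f = [](int v) -> double {
        if (v < 0) return 0;   // int literal, converted to double
        return v * 1.5;
    };
    return f(x);
}

// Usage: both lambdas behave like ordinary functions of one int.
void check_return_types() {
    assert(twice(10) == 20);
    assert(scale(-3) == 0.0);
    assert(scale(2) == 3.0);
}
```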

Finally, we have the function body, which follows the typical function syntax, and then the parentheses, which call the function. In the example provided above, the function is called with x = 10.

Lambda functions and expressions are not the only C++0x feature supported by Microsoft’s new C++ compiler; however, I find them to be a huge improvement over the Boost Lambda library.

I am very curious to see how the C++0x implementations will change the face of C++ over the next few years, especially considering the concerns I have over the new design philosophy to give C++ “facilities supportive of novices.”

June 17, 2009 · James Kukunas · One Comment
Posted in: Uncategorized

Summer 2009

Starting last Wednesday (5/13/2009), I resumed my internship with the Data Management Team at Impaqt. I’m currently not at liberty to discuss the project that I am leading due to its proprietary nature, but I can say that it is a very interesting project that I hope to be able to post about soon.

May 21, 2009 · James Kukunas · No Comments
Posted in: Uncategorized