[tcpdump-workers] BPF Extended: addressing BPF's shortcomings

Darren Reed darrenr at netbsd.org
Thu Jun 11 07:05:20 EDT 2015


On 11/06/2015 9:31 AM, Mindaugas Rasiukevicius wrote:
> Darren Reed <darrenr at netbsd.org> wrote:
>> Extending BPF
>> =============
>>
>> Introduction
>> ------------
>> BPF was originally designed to provide very fast packet matching
>> capabilities for IPv4 but as a result of its generic nature, is
>> capable of being used for just about any protocol. With IPv6 the
>> limitations of BPF became apparent.
>>
>> ...
> Conceptually, I like the idea of an extended BPF instruction set.  There
> are several important questions here.  First, what is the exact problem we
> want to solve with a new instruction set?  Is it just the IPv6 handling?
>
> I do find the BPF byte-code useful as a general purpose instruction set
> i.e. for the use as a universal virtual machine.  This was also one of the
> driving forces behind the Linux eBPF.  They use it beyond packet filtering.
> Note that LLVM has recently gained support for the eBPF backend.  If that
> is the objective, then there is a wider spectrum of the requirements.
>
> Specifically, I would like to see:
>
> - Capability to jump backwards.  Basically, the general purpose instruction
> set ought be Turing-complete.  Obviously, with a way to enable/disable this
> depending whether the user needs bpf_validate().

Do you have any thoughts on what sort of conditions this should be 
allowed in?

Guy's suggestion is intriguing and suggests to me something like this:

     If condition is true, jump (either forward or backward?) AND
     move the start of packet pointerforward by # > 0 bytes.

Rather than try and detect the move of the start of packet pointer,
force it to occur as part of the loop instruction?

>
> - 32-bit jump offsets.  Currently, they are 16-bit which is quite limiting
> if you have a larger BPF program.

Yes, agreed.

>
> - Opcode extended to 32-bits.  It seems we agree on this, although this
> can be debatable.  The classic BPF byte-code has a simple, minimalistic
> RISC-like instruction set (with the exception of BPF_MSH hack).  I would
> be inclined to keep it that way instead of polluting the, quite limited,
> instruction space with various arbitrary mechanisms, but this is somewhat
> philosophical RISC vs CISC debate.  Nevertheless, if the general feeling
> is to go with complex instructions, then we could at least dedicate a wide
> range for them.

What are RISC and CISC these days but a layer over microcode?

Consider that when BPF was developed, 32MB of RAM was a lot of memoryand
the use of long options with getopt was almost unheard of.Today thesize
of theinstruction might almost be considered to be a hindrance to 
performance
asaccessing the individual bytes is no longer performance friendly andwill
often result in the CPU fetching a complete word anyway. Thus the compact
size of the BPF instruction is no longer the benefit it once was.


>
> - Support for 64-bit words, but not quite convinced about 128-bit words.
> Do you want to add them just to accommodate IPv6?  Why not to leave this
> for the byte-code generator/compiler?

Considering Michael's comments about concentrating on what opcodes the 
compiler
should generate, it may be a better idea to focus on 128bit words rather 
than 64bit
words because it is rather elegant to have the compiler produce 
instructions that
allow for the address to be used in a single instruction rather than 
needing to do
several.

As a for example, it is easier to mentally inspect and verify the following:

ldq  [32]
jeqq #0x01    jt 2    jf 3

than to try and follow half a dozen instructions. Now when it comes to 
making
changes to those values, e.g. to apply a netmask:

ldq  [32]
andq #0xfffffffffffffff00000000000000000
jeqq #0x2002000100000a900000000000000000....

How many instructions does BPF need to emit today to do that?
Which is easier to understand?

Not only that but if you are writing manual instructions to do math
(such as addition or subtraction), it becomes harder again to ensure
what's happening is correct.Do you really want to do multiple 32bit
instructions with BPF to represent adding together two 128bit values?

I would rather have instructions with larger operands that are easier
for the parser to generate and let the interpreter (or JIT) worry about
how to execute them.


>
> - External scratch memory store or a way to initialise it before calling
> the BPF program.  Also, potentially arbitrary (dynamic) BPF_MEMWORDS size
> rather than hardcoded size.  Basically, the user/caller should be able to
> provide arbitrary data through the memory store.

What's the end goal here?

Thinking on the fly here, what if the bpf program had a data section on the
end of it or at the beginning? Although with JIT this may be starting to 
wander
down the road of inventing a object file format and that might be a bit 
too far.


>
> - BPF_COP and BPF_CALL.  When I added BPF_COP to NetBSD, I thought about
> the generic BPF_CALL to invoke *arbitrary* functions.  It requires solving
> some of the above problems first, but there is an important difference:
> BPF_COP allows program to invoke a *predetermined* set of functions, while
> the capability to invoke arbitrary functions through BPF_CALL can have
> security implications (yet it is more powerful).  I would like to see both,
> but with the ability to disable, at least, BPF_CALL.

My take on BPF_COP and BPF_CALL is that they are almost identical, 
except that
it ispossible to implement BPF_COP with BPF_CALL and not the other way 
around.
Is thata fair assessment? My examination of eBPF is that BPF_CALL doesn't
allow just any function or address to be called.


On 11/06/2015 3:32 AM, Michael Richardson wrote:
> Darren thank for taking a lead on this.
> The IPv6 header chasing problem is I think, the more important part of this.
>
> While I'm not against a re-architecture of BPF, I wonder if it will be
> widely accepted.

Carrots will be needed :)

>  From what I can see, it appears that BPF can be machine translated to BFPX,
> so it would permit one to deploy BPFX in such a way that would permit
> backwards compatibility, that's a good thing.

Yes, that was a design goal.

>
> One thing I learnt (too late) at a fabless semi company that was made a
> ram-intensive loop-less packet processing engine (simpler than BPF) was
> not to architect the instruction set and then the compiler.  We sound up
> creating instructions we never knew how to use, and we also wound up
> optimizing encodings of instructions that were seldom used.

Strange as it may seem, this almost tells me that doing 128bit 
instructions for
IPv6 work is more important than 64bit ones for the same or other things.

>
> So my advice is not to worry about the encoding, but rather to work on the
> BPF compiler (in libpcap), and see how you would use each instruction.

Ok, thanks for the tip.

Cheers,
Darren



More information about the tcpdump-workers mailing list