The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Iagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Wednesday, September 28, 2011

Xfinity failing every afternoon 2:30-6pm approx; comcast DNS problems fixed by google

Xfinity failing every afternoon 2:30-6pm approx; comcast DNS problems fixed by google

My comcast internet cvonnection was failing regularly, every weekday, starting at approximately 2:30pm, lasting until 6 or 7pm. This started happening shortly after labor day.

I conjecture that this was probably happening because schoolkids were getting off school (which in my area happens circa 2:30pm) and starting to use the net.

The failures were accompanied by error meessages of the form "DNS error... DNS lookup failure... Error 105 net:ERR_NAME_NOT_RESOLVED". That error message was from a Windows 7 machine. I also regularly have a Vista machine and an Apple Mac connected to my home network. The Vista machine occasionally worked when the Windows 7 machine did not, and the Apple Mac more frequently worked; but all regularly failed about the same time in the afternoon. Yes, I tried powering off/disconnecting/connecting only 1 machine at a time. Not a cure.

Coincidence? I think not.

I reported the problem to Comcast, who very quickly sent over a "cable guy" who apologized that my 7 year old Linksys cable modem had not been replaced. So now I have a new SMC cable modem, and the net performs much better, up to 10X faster than in earlier speed tests. This is great!

But, it did not fix the DNS problem.

My Windows 7 machine (actually, the Windows 7 laptop my employer gave me) still got DNS errors between roughly 2:30 and 7pm.

My Vista machine sometiimes ran when the Windows 7 machine failed.

And the Apple Mac seemed nearly always to work, failing only occasionally.

But the Windows 7 machine reliably failed between 2:30 and 7pm.

OK, so I get more serious. I check the DNS settings. Finally, I do what I should have done in the first place: I switch the Windows 7 machine DNS server frm "automatic", i.e. from comcast's DNS servers, to Google's DNS servers, and

Fixed! :-)

Conclusion: Comcast DNS servers have problems. Particularly when the kids get home from school.

But I'm happy that I got a new, faster, cable modem out of this. And it is interesting that different machines with their different OSes behaved differently. Different timeouts (DNS settings that don't appear at the usual configuration places)? BTW, the machine that failed most often is the fastest machine by far.


BTW, I did try fixing DNS earlier, to no avail. OpenDNS did not seem to work. But I wasn't that serious, so I did not try all possibilities. It is possible that both the cable modem needed to be upgraded, AND the DNS needed to point away from comcast's DNS servers.

Saturday, September 17, 2011

Pipelined Bloom Filter


= [[Pipelined Bloom Filter]] =

Occasionally one wants to keep track of what (micro)instructions are in a pipeline.
For example, an aggressively speculative x86 may need to
[[snoop the pipeline]]
to detect [[self-modifying code]] that writes to instructions
that are already in the process of being executed.

A brute force technique might be to [[CAM]] all of the instructions in the pipeline.
Or to mark I$ lines or even ITLB pages when instructions from them are fetched for execution.
These can be expensive and/or complex.

This suggests a Bloom filter implementation:
set bit(s) in a bitvector according to a hash of the physical address of instructions
as the enter the pipeline.

unfortunately, a straightforward Bloom filter only ever sets bits.
It only ever accumulates. Eventually this will reduce in false conflicts being detected.

The basic idea behind a [[pipelined Bloom filter]] is to have a set of Bloom filters.
E.g. 2 1024 bit Bloom filters.
A new Bloom filter is (re)allocated by clearing it every 128 instructions.
It accumulates bits as instructions as executed for the next 128 instructions.
It is checked, but new instructions are not allocated into it, for the next 128 instructions.
It is then reallocated, and cleared.
Two such filters therefore can ping pong, providing coverage to the pipeline.

== Generalized ==

A set of K filters, used to protect a pipeline.

Use filter j for a specified period of time
- e.g. L instructions, or perhaps until B bits are set,
or C entries in a counted Bloom filter are saturated.
Switch to the next.
During this period one sets bits in the filter as new items are placed in the pipeline.

After this "allocation" period, continue to use the filter for checking, without writing new entries.

Deallocate the filter when it is guaranteed that no items inserted into the filter remain in the system, i.e. the pipeline.

== Other Terminology ==

It appears that there are some papers that use the term "pipelined Bloom filter"
to refer solely to a physical or electronic pipeline, and not the the logical structure
of overlapping Bloom filters that can be deallocated.

Saturday, September 10, 2011

Google Desktop (Search) Dying

Google is end-of-life-ing several products, including Google Desktop, in particular Google Desktop Search, which I use constantly, and Google Notebook and Sidewiki, which I would like to use, but which, fortunately, I decided not to become dependent on.

Google says

    Desktop: In the last few years, there’s been a huge shift from local to cloud-based storage and computing, as well as the integration of search and gadget functionality into most modern operating systems. People now have instant access to their data, whether online or offline. As this was the goal of Google Desktop, the product will be discontinued on September 14, including all the associated APIs, services, plugins, gadgets and support.

Fair enough. I *am* trying to keep most of my information in the cloud. But unfortunately I do not have full connectivity - large areas of rural Oregon still do not have cell phone connectivity, voice, let alone data, and, for that matter, I am too cheap to pay for data connectivity at all times. Plus, some things I am not allowed to keep in the cloud, and

So it looks like I will have to use Microsoft Desktop Search, as built in to Windows 7, for stuff on my laptop. I prefer the Google user interface - but since Google Desktop Search will be discontinued, I don't have much choice. And because I am using Microsoft Desktop Search, I have started using Bing for my web search. Here, I still prefer the Google user interface, which I am familiar with - but Bing's user interface is more similar to MS Desktop Search, and I value consistency between Desktop and Web more than I value the more familiar Google Search interface.

I suspect that Microsoft Desktop Search cannot search my Google Chrome browser history. Just like Google Desktop Search could not search my Microsoft OneNote.

Unfortunately, this begins to look like dominoes falling. So long as I have substantial local data on my Microsoft PC, I am attracted, despite my personal preferences, to Microsoft tools like Microsoft Desktop Search, Bing, and Internet Explorer.

Not everyone will be in this situation. Some people will be purely cloud based. But, unfortunately, I am not. Yet?

I wonder if this will affect my next choice of cell phone? I was leaning towards Android, but perhaps this is enough to tilt me towards Windows 8.

Monday, September 05, 2011

Bit Interleaving, Instruction Set Support for


= Instruction Set Support for Bit Interleaving =

== What is Bit Interleaving? ==

[[Bit interleaving]] is a very common form of multi-operand bit permutation.
It has been important enough to occasionally warrant special instruction set support.

2-way bit interleaving of A and B;

for i = 0 to N do
result.bit[2*i] := A.bit[i]
result.bit[2*i+1] := B.bit[i]

3-way bit interleaving off A, B, C:

for i = 0 to N do
result.bit[3*i] := A.bit[i]
result.bit[3*i+1] := B.bit[i]
result.bit[3*i+2] := C.bit[i]

M-way bit interleaving of A[i]

for i = 0 to N do
for j = 0 to M do
result.bit[N*i+j] := A[j].bit[i]

Note that since bit interleaving combines multiple operands into a single operand, issues of overflow may arise.
For example, the result is often used in address arithmetic,
in which case the inputs (aka the coordinates)
are probably not full width addresses.
The bit interleaving may be used in only a subrange of the input bits.

== Why Bit Interleaving? ==

See [[http://en.wikipedia.org/wiki/Morton_number_(number_theory) Wikipedia Morton numbers]].

Bit interleaving is often used as a component of hash functions.

In particular, bit interleaving of 2D and 3D coordinates in computer graphics
often produces hash functions that have better cache locality than uninterleaved coordinates used to index an array.

== Bit Interleaving Instructions ==

Intel Larrabee has 1:1 and 2:1 interleave instructions, in both scalar and vector form:

vbitinterleave11pi: 1:1 bit-interleave vectors
vbitinterleave21pi: 2:1 bit-interleave vectors

bitinterleave11: 1:1 bit-interleave scalar
bitinterleave21: 2:1 bit-interleave scalar

(See [[http://drdobbs.com/architecture-and-design/216402188?pgno=5 Mike Abrash's Dr. Dobb's article]] which comments that
"is useful for generating swizzled addresses, particularly in conjunction with vinsertfield, for example in preparation for texture sample fetches (volume textures in the case of vbitinterleave21pi".)

The 1:1 interleaving instruction obviously accomplishes 2D interleaving,
i.e. the interleaving of two coordinates for a 2D system.

The 2:1 interleaving instruction accoplishes 3D interleaving.
Or, rather, first one interleaves two coordinates, say X and Y, using 1:1,
and then one interleaves the third coordinate, Z, using 2:1.

4:1 interleaving is accomplished by two 1:1 interleaves.
However, operations on objects and spaces of more than 1D, 2D, or 3D are much less common.

== Expand (Interleave with 0) ==

Although 1:1 and 2:1 interleaving "fit" into commion instruction sets,
higher degrees of interleaving may not - too many operands may be required.

Some support may be afforded by bit expansion,
or interleaving with 0.
This takes 2 (or 3) inputs:
* the bitvector to which 0s will be inserted between elements
* the number of 0 bits to be inserted between elements
* possibly an offset to cause the result to be aligned appropriately for the last step.

The last step would be ORing the results together.
(Or equivalently, ANDing).