person/dlofaro

Using a silicon retina for playing card recognition
“System 2: 2D FFT based”
(Created and conducted at the 2010 Neuromorphic Workshop in Telluride, CO)
2010-07-15
The goal of this project was to create a system that uses the biologically inspired DVS-128 silicon retina (SR) to detect a playing card and determine the card’s value and suit. The DVS-128 is an event-based vision sensor that detects changes in relative intensity in a live image. An important attribute of this sensor is that it is asynchronous: it sends event data (changes in the image) as they happen rather than at a fixed frame rate. Because the silicon retina only detects changes in relative intensity, it only “sees” moving or changing objects. Thus, for the system to see the playing card, the image must be moving. This is similar to how our eyes work. If we stare at an unchanging scene for an extended period without eye movement, the scene is perceived as becoming dull and then fades away. The eyes’ saccades (quick, simultaneous movements of both eyes in the same direction) are believed to help alleviate this.

System 2 consists of a DVS-128 silicon retina (standing in for the human retina) attached to a pan-tilt unit (standing in for the eye’s saccade). The pan-tilt unit lets the silicon retina rotate about the x and y axes, where the x and y axes are those of the image plane of the silicon retina’s imaging unit. System 2 can be seen in the image below.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s2/image001.jpg

Above is System 2. On the right of System 2 is the silicon retina with pan-tilt unit (SR-PT). Attached to the SR-PT is a light-blocking system (the brown board). It blocks the building’s overhead lights, which flicker at between 100 Hz and 120 Hz, a frequency within the SR’s pass band. A constant light source is supplied by a high-intensity white flashlight. The playing card is placed on a monotone background (white in this case, but any color will suffice). The SR-PT moves slightly relative to the card, simulating the eye’s saccade. An image produced in real time by the SR-PT can be seen below.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s2/image002.jpg

The grey pixels represent no events. White pixels represent events where the intensity increased (by more than a specified relative threshold), and black pixels represent events where the intensity decreased (by more than a specified relative threshold). Note that the frame above is built from 750 events: the image is redrawn using the most recent 750 events.
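
The redraw scheme described above can be sketched as follows. This is only an illustration, not the software used at the workshop: the `(x, y, polarity)` event tuple, the display values, and the synthetic event stream are all assumptions.

```python
from collections import deque

SIZE = 128      # DVS-128 resolution (128 x 128 pixels)
WINDOW = 750    # number of most recent events drawn per frame (from the text)
GREY, WHITE, BLACK = 128, 255, 0   # assumed display values

def render_frame(events, size=SIZE):
    """Draw one frame from (x, y, polarity) events.

    Pixels with no event stay grey; later events overwrite earlier ones.
    """
    frame = [[GREY] * size for _ in range(size)]
    for x, y, polarity in events:
        frame[y][x] = WHITE if polarity > 0 else BLACK
    return frame

# A deque with maxlen silently drops the oldest event once it is full,
# so it always holds the most recent WINDOW events.
recent = deque(maxlen=WINDOW)

# Hypothetical synthetic stream: 1000 events with alternating polarity.
for i in range(1000):
    recent.append((i % SIZE, (i // SIZE) % SIZE, 1 if i % 2 == 0 else -1))

frame = render_frame(recent)
```

With 1000 synthetic events pushed through a 750-event window, the first 250 events no longer appear in the frame.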

How to determine the card:

The playing card is determined by:
- Making a template to check the cards against
- Checking the desired card against the template
Making the template:
- Record 10k events for each card
- Make a 2D event array (EA) where the x,y indices are the pixel locations. The value at each index is the number of times an event occurred at that pixel within the 10k events.
- Normalize the EA
- Take the 512-point 2D FFT of each EA (52 in total). This creates a unique 2D frequency spectrum for each card. This is orientation independent owing to the nature of the 2D FFT. It is important to note that each of the four suits (spades, hearts, clubs, diamonds) has a different shape and thus a different 2D FFT. In addition, the number of “pips” on each card changes the 2D FFT for the card, thus giving each card a unique 2D FFT.
- Normalize the 2D FFT of the EA.
- This set of 2D FFTs on the EA will be known as the gold standard (GS) or template for each of the matching tests.
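
The template steps above can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the original implementation: the text does not say how the arrays were normalized or which part of the FFT was kept, so scaling by the peak value and using the FFT magnitude are assumptions, as are the function names and the random stand-in recording.

```python
import numpy as np

SENSOR = 128   # DVS-128 resolution
N_FFT = 512    # FFT size per axis (from the text)

def event_array(xs, ys, size=SENSOR):
    """Count events per pixel; the array is indexed [y, x]."""
    ea = np.zeros((size, size), dtype=float)
    np.add.at(ea, (ys, xs), 1.0)   # accumulates repeated indices correctly
    return ea

def normalize(a):
    """Scale so the largest value is 1 (an assumed normalization)."""
    peak = a.max()
    return a / peak if peak > 0 else a

def spectrum(ea, n=N_FFT):
    """Normalized magnitude of the zero-padded n x n 2D FFT (magnitude is assumed)."""
    return normalize(np.abs(np.fft.fft2(normalize(ea), s=(n, n))))

def build_templates(recordings):
    """recordings maps a card label to (xs, ys) arrays of event coordinates."""
    return {label: spectrum(event_array(xs, ys))
            for label, (xs, ys) in recordings.items()}

# Hypothetical stand-in for one 10k-event recording (random, not real card data).
rng = np.random.default_rng(0)
xs = rng.integers(0, SENSOR, size=10_000)
ys = rng.integers(0, SENSOR, size=10_000)
gs = build_templates({"2C": (xs, ys)})
```

In a full run, `recordings` would hold one entry per card, giving 52 templates.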
Check the desired card against the template:
- Record 10k events for the unknown card
- Make a 2D unknown event array (UEA) where the x,y indices are the pixel locations. The value at each index is the number of times an event occurred at that pixel within the 10k events.
- Take the 512-point 2D FFT of the UEA.
- Normalize the 2D FFT of the UEA.
- Find the error between the normalized 2D FFT of the UEA and each of the 52 GS templates.
- The template with the lowest error (i.e. highest correlation) identifies the unknown card.
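
The matching step can be sketched as a nearest-template search. The text does not name the error measure, so mean squared error is used here purely as an assumption; the small random arrays are hypothetical stand-ins for the 512 x 512 spectra.

```python
import numpy as np

def match_card(unknown_spectrum, templates):
    """Return (best_label, errors) using mean squared error (an assumed metric)."""
    errors = {label: float(np.mean((unknown_spectrum - gs) ** 2))
              for label, gs in templates.items()}
    best = min(errors, key=errors.get)   # label with the lowest error
    return best, errors

# Hypothetical stand-in spectra for four cards.
rng = np.random.default_rng(1)
templates = {label: rng.random((8, 8)) for label in ("2C", "2S", "2H", "2D")}

# An "unknown" card: the 2H template plus a little noise.
unknown = templates["2H"] + 0.01 * rng.standard_normal((8, 8))
best, errors = match_card(unknown, templates)
```

Here `best` is the label whose template is closest to the unknown spectrum, and `errors` keeps every score so that near misses can be inspected.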
An example of the 2D FFTs of playing cards can be found below. The cards shown are (from top left to bottom right) the 2 of clubs (2C), 2 of spades (2S), 2 of hearts (2H), and 2 of diamonds (2D). It is important to note that the unique shape of each 2D FFT is caused by the different shape of each suit. Suits with a triangular point (such as spades, hearts, and diamonds) have higher frequency content than more rounded suits such as clubs, which have lower frequency content. Spades and hearts have both rounded and pointed features, so they have both high and low frequency content.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s2/image003.jpg

Results: The above method was tested with a GS created from two separate decks of cards. System 2 was also tested with a Gaussian blur applied to the GS. This blur ranged from size 1 to size 5. A test set of 104 cards (different from those used to create the GS) was used to test the identification accuracy. Below is a plot of the percent correct identification vs. Gaussian blur.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s2/image004.jpg

It was found that there was little difference with a blur ranging from 1 to 3. The average correct identification was 85%. Conclusion: System 2 was able to consistently identify the correct playing card out of a deck of 52 with an accuracy of 85%.







Using a silicon retina for playing card recognition
“System 1: FFT based”
(Created and conducted at the 2010 Neuromorphic Workshop in Telluride, CO)
2010-07-15

System 1 uses the same DVS-128 silicon retina (SR) described above. Because the SR only detects changes in relative intensity, the card must move relative to the sensor to be seen. To keep the tests consistent, the playing card should move in a constant direction with a constant velocity or constant acceleration, and the silicon retina should remain at a fixed location for all of the tests. In this case the DVS-128 silicon retina (lens: 12 mm, f/2.8) was placed 19 cm away from the card, directly facing it. The card slides down the vertical slide seen in the picture below. The slide keeps the card’s direction constant and the entry velocity and acceleration consistent between runs. The picture below depicts the system (System 1).

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image001.jpg

The card slides down the slide and passes in front of the silicon retina. The changes in intensity read by the silicon retina form an image in three dimensions: x, y, and time (t). The silicon retina has a resolution of 128 by 128 pixels, and within a given time span multiple events can occur at the same pixel location. Below is a plot of the silicon retina’s output for the time span in which the card is dropped and passes through the field of view. The two horizontal axes are the x and y coordinates and the vertical axis is time. Each blue dot represents an event at the specified pixel location.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image002.jpg

The location in time (t) where there are significantly more events is the period of time when the card fell through the silicon retina’s field of view. The other, sparse events are noise. A close-up of the period of time when the card passed through the silicon retina’s field of view can be found below.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image003.jpg
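
One way to pick out the dense stretch of events automatically is to bin the timestamps and keep the bins whose count is well above the typical (noise) level. This is only a sketch: the microsecond timestamps, the 10 ms bin width, the factor of 5, and the synthetic data are all assumptions, and the median-based threshold only works when the card occupies a small fraction of the recording.

```python
from collections import Counter
from statistics import median

def find_event_period(timestamps_us, bin_us=10_000, factor=5.0):
    """Return (start_us, end_us) spanning the bins with unusually many events.

    A bin counts as busy when it holds more than factor times the median
    bin count. Returns None if no bin qualifies.
    """
    counts = Counter(t // bin_us for t in timestamps_us)
    if not counts:
        return None
    first, last = min(counts), max(counts)
    per_bin = [counts.get(b, 0) for b in range(first, last + 1)]
    threshold = factor * max(median(per_bin), 1)
    busy = [first + i for i, c in enumerate(per_bin) if c > threshold]
    if not busy:
        return None
    return busy[0] * bin_us, (busy[-1] + 1) * bin_us

# Hypothetical data: sparse noise over 1 s plus a dense 80 ms burst.
noise = list(range(0, 1_000_000, 5_000))
burst = [400_000 + 40 * i for i in range(2000)]
period = find_event_period(noise + burst)
```

For this synthetic stream, `period` covers the burst and excludes the surrounding noise.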

It is plain to see that there is structure in the event period (EP). Further examination of the falling trajectory will help us identify the card. Below is a plot of the number of events occurring on the x-y plane during the EP. The color scale is normalized so that red is a large number of events and blue is a low number of events.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image004.jpg

In the above plot there is a well-defined stripe down the center of the x axis. This denotes that there is one column of “pips” (also known as card suit symbols).
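
Counting such stripes can be sketched by summing events along one axis and counting the contiguous runs that rise above a threshold. The `(x, y)` event tuples, the half-of-peak threshold, and the synthetic events are assumptions made for illustration; the same run counting could be applied to other projections.

```python
def column_profile(events, size=128):
    """Sum event counts per x coordinate from (x, y) event tuples."""
    profile = [0] * size
    for x, _y in events:
        profile[x] += 1
    return profile

def count_stripes(profile, frac=0.5):
    """Count contiguous runs of bins above frac * max(profile)."""
    if not profile or max(profile) <= 0:
        return 0
    threshold = frac * max(profile)
    stripes, inside = 0, False
    for value in profile:
        above = value > threshold
        if above and not inside:   # a new run starts here
            stripes += 1
        inside = above
    return stripes

# Hypothetical events: one dense band at x = 60..67 plus two noise events.
events = [(x, y) for x in range(60, 68) for y in range(128)]
events += [(5, 10), (100, 20)]
columns = count_stripes(column_profile(events))
```

The relative threshold keeps isolated noise events from being counted as stripes of their own.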

Below is the y-t plot.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image005.jpg

There are two defined stripes in the y-t plot. This shows that there are two columns of “pips” in the card that was dropped. The number of rows and the number of columns present can help us discern the card number (ace through ten). Below is an example of a card that shows only one column and two rows; it is a two.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image006.jpg

Next is a card that shows one column and one row; it is an ace.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image007.jpg

Next is a card that shows three columns and five rows; it is an eight. The eight and nine are special cases because they both have three columns and five rows. The density of events in the center column determines whether the card is an eight or a nine: a lower center-column density, compared with the two side columns, indicates a nine.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image008.jpg

Below is a chart defining each card by the number of columns and rows present.

http://dasl.mem.drexel.edu/~danLofaro/events/telluride/2010/s1/image009.jpg
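
The column and row rules can be expressed as a small lookup. The sketch below covers only the cases spelled out in the text (ace, two, and the eight/nine special case); the remaining entries would come from the chart above and are deliberately left out. The function name, its arguments, and the 0.45 density ratio threshold are hypothetical.

```python
def classify(columns, rows, center_density=None, side_density=None,
             ratio_threshold=0.45):
    """Map column and row counts to a card value for the cases given in the text.

    Returns None for combinations not covered here. ratio_threshold is an
    assumed cut-off on center density relative to side density.
    """
    if (columns, rows) == (1, 1):
        return "ace"
    if (columns, rows) == (1, 2):
        return "two"
    if (columns, rows) == (3, 5):
        if center_density is None or not side_density:
            return None   # cannot separate eight from nine without densities
        ratio = center_density / side_density
        return "nine" if ratio < ratio_threshold else "eight"
    return None

print(classify(1, 2))              # one column, two rows
print(classify(3, 5, 1.0, 4.0))    # sparse center column
print(classify(3, 5, 2.0, 3.0))    # denser center column
```

The densities passed in could be, for example, the summed event counts of the center column and of one side column during the event period.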

Conclusion:

Overall this method worked; however, it was unable to detect face cards or the suit. The method did show that the silicon retina can determine the card value (ace through 10) even though the card is visible for less than 80 ms.