Ethernet vol2 : a look at the RGMII connection

Get complete sources from ac_inout_psu/source/system_control/system_components/ethernet/

Now with the link up and running, as indicated by the data accessed through the MDIO, the next thing to do is to catch an ethernet frame from the ethernet cable through the phy RGMII connection. RGMII is a 12 io source synchronous interface consisting of separate RX and TX clocks, an enable/error io and 4 data ios per direction.

Figure 1. RGMII connections in the control board pcb layout design. The ic labeled “49” is the ethernet phy chip. The spaghetti in the traces is used to match the data trace lengths.

The io data is read directly with no extra decoding and the enable/error (ctl) indicates when valid data is present in the RGMII interface.
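As a minimal sketch of how the two ctl samples of one clock period could be interpreted, assuming the usual RGMII convention where the sample taken on the rising edge carries data valid and the sample taken on the falling edge carries data valid xor error. The package and function names below are hypothetical and are not part of the ethernet sources.

```vhdl
library ieee;
    use ieee.std_logic_1164.all;

package rgmii_ctl_sketch_pkg is
    -- ctl_rising  : ctl sampled on the rising clock edge  (data valid)
    -- ctl_falling : ctl sampled on the falling clock edge (data valid xor error)
    function rx_data_is_valid ( ctl_rising, ctl_falling : std_logic)
        return boolean;

    function rx_error_is_flagged ( ctl_rising, ctl_falling : std_logic)
        return boolean;
end package rgmii_ctl_sketch_pkg;

package body rgmii_ctl_sketch_pkg is

    function rx_data_is_valid ( ctl_rising, ctl_falling : std_logic)
        return boolean
    is
    begin
        -- data is valid and error free only when both samples are high
        return ctl_rising = '1' and ctl_falling = '1';
    end rx_data_is_valid;

    function rx_error_is_flagged ( ctl_rising, ctl_falling : std_logic)
        return boolean
    is
    begin
        -- the error bit is recovered as the xor of the two samples
        return (ctl_rising xor ctl_falling) = '1';
    end rx_error_is_flagged;

end package body rgmii_ctl_sketch_pkg;
```

With functions like these, a byte would be accepted only on clock periods where rx_data_is_valid returns true.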

RGMII clocking

Data through the RGMII is clocked with the corresponding rx or tx clock in source synchronous double data rate (ddr) mode. The ddr io clocking is accomplished by dedicating a PLL to the ethernet connection. The ethernet clocks are generated with the FPGA PLL from the rx reference clock generated by the PHY. Since the data should be captured at the center of each data bit, the PLL is configured such that the rx IO capture clock is phase shifted by half a data period, or 90 degrees of the clock period, which aligns the capture clock edges to the center of the data bits. For the transmitter side, a reference clock is phase shifted by 90 degrees and sent to the phy. The transmitter logic is clocked with the same clock as the receiver, which allows easy transfer of data between receiver and transmitter modules inside the FPGA. In the hardware implementation, the PLL has extra phase shift to allow for compensating the PCB trace skew.
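As a quick check of the 90 degree figure, the shift can be worked out from the clock period. This assumes the 125 MHz RGMII clock of gigabit mode, which is not stated explicitly above, and it ignores the extra shift used for trace skew compensation.

```latex
\begin{align*}
T_{clk}   &= \frac{1}{125\ \mathrm{MHz}} = 8\ \mathrm{ns} \\
T_{bit}   &= \frac{T_{clk}}{2} = 4\ \mathrm{ns}
             \quad \text{(two data bits per clock period in ddr mode)} \\
t_{shift} &= \frac{T_{bit}}{2} = 2\ \mathrm{ns} \\
\varphi   &= 360^{\circ} \cdot \frac{t_{shift}}{T_{clk}} = 90^{\circ}
\end{align*}
```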

RGMII implementation

The data bits are read through the RGMII with ddr io buffers, which are an integrated feature of each io in most FPGAs. The ethernet receiver is divided into an IO interface module called ethernet_rx_ddio and an ethernet frame receiver module.

The ethernet_rx_ddio module encapsulates the RGMII interface with the following interface description

				
package ethernet_rx_ddio_pkg is
------------------------------------------------------------------------
    type ethernet_rx_ddio_FPGA_input_group is record
        rx_ctl : std_logic;
        ethernet_rx_ddio_in : std_logic_vector(3 downto 0);
    end record;
     
------------------------------------------------------------------------
    type ethernet_rx_ddio_data_output_group is record
        rx_ctl : std_logic_vector(1 downto 0);
        ethernet_rx_byte : std_logic_vector(7 downto 0);
    end record;
     
------------------------------------------------------------------------
    component ethernet_rx_ddio is
        port (
            ethernet_rx_ddio_clocks   : in ethernet_rx_ddr_clock_group;
            ethernet_rx_ddio_FPGA_in  : in ethernet_rx_ddio_FPGA_input_group;
            ethernet_rx_ddio_data_out : out ethernet_rx_ddio_data_output_group
        );
    end component ethernet_rx_ddio;
------------------------------------------------------------------------
    function get_byte ( ethernet_rx_output : ethernet_rx_ddio_data_output_group)
        return std_logic_vector;
------------------------------------------------------------------------
    function get_reversed_byte (
        ethernet_rx_output : ethernet_rx_ddio_data_output_group)
        return std_logic_vector;
------------------------------------------------------------------------
    function ethernet_rx_is_active ( ethernet_rx_ddr_output : ethernet_rx_ddio_data_output_group)
        return boolean;
------------------------------------------------------------------------
end package ethernet_rx_ddio_pkg;
				
			

The module implementation is simply an instantiation of ethddio_rx, which is a Quartus IP core for the FPGA ddr buffer.

				
architecture cl10_rx_ddio of ethernet_rx_ddio is
 
    alias ddio_rx_clock is ethernet_rx_ddio_clocks.rx_ddr_clock;
    signal ddio_fpga_in : std_logic_vector(4 downto 0);
    alias ethernet_byte_to_fpga is ethernet_rx_ddio_data_out.ethernet_rx_byte;
 
    component ethddio_rx IS
    PORT
    (
        datain    : IN STD_LOGIC_VECTOR (4 DOWNTO 0);
        inclock   : IN STD_LOGIC ;
        dataout_h : OUT STD_LOGIC_VECTOR (4 DOWNTO 0);
        dataout_l : OUT STD_LOGIC_VECTOR (4 DOWNTO 0)
    );
    END component;
 
 
    signal dataout_h : STD_LOGIC_VECTOR (4 DOWNTO 0);
    signal dataout_l : STD_LOGIC_VECTOR (4 DOWNTO 0);
 
------------------------------------------------------------------------
begin
 
    ddio_fpga_in(4) <= ethernet_rx_ddio_FPGA_in.rx_ctl;
    ddio_fpga_in(3 downto 0) <= ethernet_rx_ddio_FPGA_in.ethernet_rx_ddio_in;
 
    ethernet_rx_ddio_data_out <= (rx_ctl           => dataout_l(4)          & dataout_h(4),
                                  ethernet_rx_byte => dataout_l(3 downto 0) & dataout_h(3 downto 0));
 
------------------------------------------------------------------------
    u_ethddio : ethddio_rx
        PORT map(
            ddio_fpga_in  ,
            ddio_rx_clock ,
            dataout_h     ,
            dataout_l
        );
 
------------------------------------------------------------------------
end cl10_rx_ddio;
				
			

With the RGMII ddr interface routines, the logic which captures data from the RGMII is simply: if the data is valid, then get a byte from the ethernet rx ddr. This is accomplished in the ethernet_frame_receiver as follows

				
entity ethernet_frame_receiver is
    port (
        ethernet_frame_receiver_clocks   : in ethernet_rx_ddr_clock_group;
        ethernet_frame_receiver_FPGA_in  : in ethernet_frame_receiver_FPGA_input_group;
        ethernet_frame_receiver_data_in  : in ethernet_frame_receiver_data_input_group;
        ethernet_frame_receiver_data_out : out ethernet_frame_receiver_data_output_group
    );
end entity ethernet_frame_receiver;
 
architecture rtl of ethernet_frame_receiver is
 
    alias rx_ddr_clock is ethernet_frame_receiver_clocks.rx_ddr_clock; 
 
    signal ethernet_rx_ddio_data_out  : ethernet_rx_ddio_data_output_group;
    signal ethernet_rx : ethernet_receiver;
 
------------------------------------------------------------------------
begin
 
    ethernet_frame_receiver_data_out <= (
                                            test_data                    => ethernet_rx.test_data,
                                            data_has_been_written_when_1 => ethernet_rx.data_has_been_written_when_1
                                        );
 
------------------------------------------------------------------------
    frame_receiver : process(rx_ddr_clock) 
 
    begin
        if rising_edge(rx_ddr_clock) then
 
            ethernet_rx.rx_shift_register <= ethernet_rx.rx_shift_register(7 downto 0) & get_byte(ethernet_rx_ddio_data_out); 
 
            if ethernet_rx_is_active(ethernet_rx_ddio_data_out) then
                capture_ethernet_frame(ethernet_rx, ethernet_rx_ddio_data_out); 
 
            else
                idle_ethernet_rx(ethernet_rx);
 
            end if; 
 
        end if; --rising_edge
    end process frame_receiver; 
 
------------------------------------------------------------------------
    u_ethernet_rx_ddio : ethernet_rx_ddio
    port map( ethernet_frame_receiver_clocks                           ,
              ethernet_frame_receiver_FPGA_in.ethernet_rx_ddio_FPGA_in ,
              ethernet_rx_ddio_data_out);
 
------------------------------------------------------------------------
end rtl;
				
			

The phy allows any data to be transmitted through the cable, but using ethernet frames is what allows the connection to be universal.

Ethernet frame capture from the RGMII

The ethernet frame has six main components: a start of frame delimiter, destination MAC address, source MAC address, type field, payload and the frame check sequence. Since the frame receiver is only responsible for buffering the ethernet frame obtained from the RGMII interface, only the start of frame and the frame check sequence are of concern here and the frame processing is left for other modules.

The correct frame start point is deduced from the start of frame pattern and the end of the frame is detected from the transmission having finished, as indicated by ethernet_rx_is_active returning false. The frame receiver thus does not need to know the length of the frame, nor does it need to parse any data from the frame.

The data bits of the ethernet frame are grouped in octets, or groups of 8 data bits, which are transmitted at the io clock rate in gigabit mode. The ethernet standard specifies that the two 4 bit halves of each octet are transmitted in swapped order, meaning that a byte pattern “AB|CD|EF” in the ethernet frame is transmitted as “BA|DC|FE”. The bits are also sent with least significant bit first, therefore the ethernet frame is captured with its bits swapped and ordered from lsb to msb in 4 bit groups

				
function get_reversed_byte
(
    ethernet_rx_output : ethernet_rx_ddio_data_output_group
)
return std_logic_vector
is
    variable byte_reversed : std_logic_vector(7 downto 0);
begin
    -- reverse the bit order inside each 4 bit group, the groups keep their places
    byte_reversed := ethernet_rx_output.ethernet_rx_byte(4) &
                     ethernet_rx_output.ethernet_rx_byte(5) &
                     ethernet_rx_output.ethernet_rx_byte(6) &
                     ethernet_rx_output.ethernet_rx_byte(7) &
                     ethernet_rx_output.ethernet_rx_byte(0) &
                     ethernet_rx_output.ethernet_rx_byte(1) &
                     ethernet_rx_output.ethernet_rx_byte(2) &
                     ethernet_rx_output.ethernet_rx_byte(3);
    return byte_reversed;
end get_reversed_byte;
				
			

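The reordering done by get_reversed_byte can be tried out in simulation with a small self checking testbench. This is only a sketch: reverse_bits_in_nibbles is a hypothetical loop based equivalent that works on a plain 8 bit vector instead of the ddio output record, and the test values are chosen just for illustration.

```vhdl
library ieee;
    use ieee.std_logic_1164.all;

entity nibble_reverse_tb is
end entity nibble_reverse_tb;

architecture sim of nibble_reverse_tb is

    -- bit order is reversed inside each 4 bit group, the groups stay in place
    function reverse_bits_in_nibbles
    (
        byte_in : std_logic_vector(7 downto 0)
    )
    return std_logic_vector
    is
        variable reversed : std_logic_vector(7 downto 0);
    begin
        for i in 0 to 3 loop
            reversed(i)     := byte_in(3 - i);
            reversed(i + 4) := byte_in(7 - i);
        end loop;
        return reversed;
    end reverse_bits_in_nibbles;

begin

    check_reordering : process
    begin
        -- "1010" reverses to "0101" and "1011" reverses to "1101"
        assert reverse_bits_in_nibbles(x"AB") = x"5D"
            report "AB was not reordered to 5D" severity failure;

        -- "0001" reverses to "1000" and "1000" reverses to "0001"
        assert reverse_bits_in_nibbles(x"18") = x"81"
            report "18 was not reordered to 81" severity failure;

        -- applying the reordering twice returns the original byte
        assert reverse_bits_in_nibbles(reverse_bits_in_nibbles(x"AB")) = x"AB"
            report "double reordering did not return AB" severity failure;

        wait;
    end process check_reordering;

end architecture sim;
```

A check like this is a cheap way to pin down one of the reordering steps before the data goes through the uart and the console application.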
The ethernet frame receiver is implemented with an ethernet rx object which encapsulates the frame receiver functionality. The ethernet frame receiver has the following interface description

				
package ethernet_frame_receiver_internal_pkg is
 
    constant ethernet_fcs_checksum    : std_logic_vector(31 downto 0) := x"c704dd7b";
    constant ethernet_frame_delimiter : std_logic_vector(7 downto 0)  := x"AB";
    constant ethernet_frame_preamble  : std_logic_vector(15 downto 0) := x"AAAA";
 
    type list_of_frame_receiver_states is (wait_for_start_of_frame, receive_frame);
 
    type ethernet_receiver is record
        frame_receiver_state         : list_of_frame_receiver_states;
        rx_shift_register            : std_logic_vector(15 downto 0);
        data_has_been_written_when_1 : std_logic;
        fcs_shift_register           : std_logic_vector(31 downto 0);
 
        test_data                    : bytearray;
        bytearray_index_counter      : natural range 0 to bytearray'high;
    end record;
 
------------------------------------------------------------------------
    procedure capture_ethernet_frame (
        signal ethernet_rx : inout ethernet_receiver;
        ethernet_ddio_out : ethernet_rx_ddio_data_output_group);
 
------------------------------------------------------------------------
    procedure idle_ethernet_rx (
        signal ethernet_rx : inout ethernet_receiver);
------------------------------------------------------------------------
    procedure calculate_fcs (
        signal ethernet_rx : inout ethernet_receiver;
        ethernet_ddio_out : ethernet_rx_ddio_data_output_group);
 
end package ethernet_frame_receiver_internal_pkg;
				
			

The frame receiver has 2 states, which are wait_for_start_of_frame and receive_frame. The start of an ethernet frame is detected by looking for the ethernet_frame_delimiter “AB” preceded by the ethernet_frame_preamble “AAAA” in the ddr data buffer. For testing purposes, the received frame is buffered to a byte array which is then transmitted with uart to a PC and printed to the console.

				
    ------------------------------------------------------------------------
    procedure capture_ethernet_frame
    (
        signal ethernet_rx : inout ethernet_receiver;
        ethernet_ddio_out : ethernet_rx_ddio_data_output_group
    ) is
        alias frame_receiver_state         is ethernet_rx.frame_receiver_state        ;
        alias rx_shift_register            is ethernet_rx.rx_shift_register           ;
        alias test_data                    is ethernet_rx.test_data                   ;
        alias data_has_been_written_when_1 is ethernet_rx.data_has_been_written_when_1;
        alias bytearray_index_counter      is ethernet_rx.bytearray_index_counter     ;
        alias fcs_shift_register           is ethernet_rx.fcs_shift_register          ;
 
    begin
 
        CASE frame_receiver_state is
            WHEN wait_for_start_of_frame =>
                if rx_shift_register = ethernet_frame_preamble and get_byte(ethernet_ddio_out) = ethernet_frame_delimiter  then
                    frame_receiver_state <= receive_frame;
                end if;
 
            WHEN receive_frame =>
 
                if bytearray_index_counter < bytearray'high then
                    bytearray_index_counter <= bytearray_index_counter + 1;
 
                    test_data(bytearray_index_counter) <= get_reversed_byte(ethernet_ddio_out);
                end if; 
 
                calculate_fcs(ethernet_rx, ethernet_ddio_out); 
 
        end CASE;
    end capture_ethernet_frame;
				
			

The last part of the ethernet frame receiver is the frame check sequence.

Frame Check Sequence / CRC32

The frame check sequence is a 32 bit crc. The idea behind a CRC is that it is an error detector which calculates a checksum over the ethernet frame, including the CRC at the end of the frame. If the expected checksum is not found, then there were errors in the received frame.

Due to the bit and byte swapping in the frame transmission the crc computation is already confusing, but as an added bonus the crc is also described in the ethernet standard to be transmitted with highest bit first, thus in opposite order compared to the data. To avoid the need to swap the crc bits, the crc is calculated with the bits in reverse order compared to the data, as this produces a checksum 0xc704dd7b that is then used to verify a correctly captured frame. For this reason, the ethernet_rx_ddio module has both get_byte and get_reversed_byte functions.
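The fixed checksum can be reproduced in simulation without any hardware. The sketch below is not the generated nextCRC32_D8 function but a hypothetical bit serial equivalent, written under the assumption that the generated function consumes bit 7 of its input first, so that feeding it bit reversed bytes corresponds to feeding normal bytes lsb first. The test frame is the ASCII string "123456789" followed by its well known CRC-32 value 0xCBF43926, appended least significant byte first.

```vhdl
library ieee;
    use ieee.std_logic_1164.all;

entity crc32_residue_tb is
end entity crc32_residue_tb;

architecture sim of crc32_residue_tb is

    type octet_array is array (natural range <>) of std_logic_vector(7 downto 0);

    -- "123456789" followed by its crc 0xCBF43926, lowest byte first
    constant test_frame : octet_array(0 to 12) :=
        (x"31", x"32", x"33", x"34", x"35", x"36", x"37", x"38", x"39",
         x"26", x"39", x"F4", x"CB");

    constant crc32_residue : std_logic_vector(31 downto 0) := x"C704DD7B";

    -- one step of a bit serial crc with polynomial 0x04C11DB7, msb shifted out first
    function next_crc32_bit
    (
        crc      : std_logic_vector(31 downto 0);
        data_bit : std_logic
    )
    return std_logic_vector
    is
        constant polynomial : std_logic_vector(31 downto 0) := x"04C11DB7";
        variable next_crc   : std_logic_vector(31 downto 0);
    begin
        next_crc := crc(30 downto 0) & '0';
        if (crc(31) xor data_bit) = '1' then
            next_crc := next_crc xor polynomial;
        end if;
        return next_crc;
    end next_crc32_bit;

    -- octets are given in normal bit order and fed lsb first,
    -- which is the order in which the bits travel on the wire
    function next_crc32_octet
    (
        crc   : std_logic_vector(31 downto 0);
        octet : std_logic_vector(7 downto 0)
    )
    return std_logic_vector
    is
        variable next_crc : std_logic_vector(31 downto 0) := crc;
    begin
        for i in 0 to 7 loop
            next_crc := next_crc32_bit(next_crc, octet(i));
        end loop;
        return next_crc;
    end next_crc32_octet;

begin

    check_residue : process
        -- the crc register starts from all ones
        variable crc : std_logic_vector(31 downto 0) := (others => '1');
    begin
        for i in test_frame'range loop
            crc := next_crc32_octet(crc, test_frame(i));
        end loop;

        assert crc = crc32_residue
            report "residue was not found" severity failure;
        report "residue found after data and fcs";
        wait;
    end process check_residue;

end architecture sim;
```

Because the register is never inverted or reflected at the end, any error free frame leaves the same constant in it once the frame check sequence has passed through, which is why a single comparison is enough in hardware.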

Since there can be footers, or valid data bits after the frame, the CRC function is run until the checksum is found

				
procedure calculate_fcs
(
    signal ethernet_rx : inout ethernet_receiver;
    ethernet_ddio_out : ethernet_rx_ddio_data_output_group
) is
    alias fcs_shift_register is ethernet_rx.fcs_shift_register;
begin
    -- the crc is frozen once the expected checksum has been found,
    -- so data arriving after the frame does not change it
    if fcs_shift_register /= ethernet_fcs_checksum then
        fcs_shift_register <= nextCRC32_D8(get_byte(ethernet_ddio_out), fcs_shift_register);
    end if;

end calculate_fcs;
				
			

The function nextCRC32_D8 is obtained from an absolutely fantastic tool that simply produces a VHDL function that calculates the CRC with given number of input bits!

https://www.easics.com/crctool/

Since ethernet frame capture through the RGMII needs no further functionality, it is next tested with hardware.

Test with FPGA

The test code in system_components is modified from the mdio test by first transmitting 128 octets from the ethernet module and then the mdio registers

				
    --------------------------------------------------
    test_with_uart : process(clock)
 
        --------------------------------------------------
        function get_square_wave_from_counter
        (
            counter_value : integer
        )
        return int18
        is
        begin
            if counter_value > 32767 then
                return 55e3;
            else
                return 15e3;
            end if;
        end get_square_wave_from_counter;
        --------------------------------------------------
 
        variable register_counter : natural range 0 to 31 := 0;
         
    begin
        if rising_edge(clock) then
 
            create_bandpass_filter(bandpass_filter);
 
            init_mdio_driver(mdio_driver_data_in);
 
            idle_adc(spi_sar_adc_data_in);
            init_uart(uart_data_in);
            receive_data_from_uart(uart_data_out, uart_rx_data);
            system_components_FPGA_out.test_ad_mux <= integer_to_std(number_to_be_converted => uart_rx_data, bits_in_word => 3);
 
            uart_transmit_counter <= uart_transmit_counter - 1; 
            if uart_transmit_counter = 0 then
                uart_transmit_counter <= counter_at_100khz;
                start_ad_conversion(spi_sar_adc_data_in); 
            end if; 
 
            if ad_conversion_is_ready(spi_sar_adc_data_out) then
 
                CASE uart_rx_data is
                    WHEN 10 => transmit_16_bit_word_with_uart(uart_data_in, get_filter_output(bandpass_filter.low_pass_filter) );
                    WHEN 11 => transmit_16_bit_word_with_uart(uart_data_in, (bandpass_filter.low_pass_filter.filter_input - get_filter_output(bandpass_filter.low_pass_filter))/2+32768);
                    WHEN 12 => transmit_16_bit_word_with_uart(uart_data_in, get_filter_output(bandpass_filter)/2+32768);
                    WHEN 13 => transmit_16_bit_word_with_uart(uart_data_in, bandpass_filter.low_pass_filter.filter_input - get_filter_output(bandpass_filter));
                    WHEN 14 => transmit_16_bit_word_with_uart(uart_data_in, get_adc_data(spi_sar_adc_data_out));
                    WHEN 15 => transmit_16_bit_word_with_uart(uart_data_in, uart_rx_data);
                    WHEN others => -- get data from MDIO
                        register_counter := register_counter + 1;
                        if test_counter = 100 then
                            -- write_data_to_mdio(mdio_driver_data_in, x"00", x"00", force_1000MHz_connection);
                        else
                            read_data_from_mdio(mdio_driver_data_in, x"00", integer_to_std(register_counter, 8));
                        end if;
                end CASE; 
 
                filter_data(bandpass_filter, get_square_wave_from_counter(test_counter));
                test_counter <= test_counter + 1; 
                if test_counter = 65535 then
                    test_counter <= 0;
                end if;
            end if;
 
            if mdio_data_read_is_ready(mdio_driver_data_out) then
                if test_counter < 64+32 then
                        transmit_16_bit_word_with_uart(uart_data_in, get_data_from_mdio(mdio_driver_data_out));
                end if;
 
                if test_counter < 64 then
 
                        transmit_16_bit_word_with_uart(uart_data_in, 
                                                       ethernet_data_out.ethernet_frame_receiver_data_out.test_data(test_counter*2+1) & 
                                                       ethernet_data_out.ethernet_frame_receiver_data_out.test_data(test_counter*2));
                end if;
 
            end if;
 
        end if; --rising_edge
    end process test_with_uart; 
				
			

The hardware test setup is remarkably simple since just by connecting a cable between a computer and the control board, the computer will try to resolve the link partner by transmitting address resolution protocol frames to it.

The captured ethernet frame is transmitted with uart and printed to the console. Data checking is accomplished by using a crc to verify the checksum of the captured ethernet frame. 128 octets are printed to the console and a correct FCS is indicated by padding the unused registers with “EE” when the checksum was found and with “DD” when it was not found.

				
    ------------------------------------------------------------------------
    procedure idle_ethernet_rx
    (
        signal ethernet_rx : inout ethernet_receiver
         
    ) is
        alias frame_receiver_state         is ethernet_rx.frame_receiver_state        ;
        alias rx_shift_register            is ethernet_rx.rx_shift_register           ;
        alias test_data                    is ethernet_rx.test_data                   ;
        alias data_has_been_written_when_1 is ethernet_rx.data_has_been_written_when_1;
        alias bytearray_index_counter      is ethernet_rx.bytearray_index_counter     ;
        alias fcs_shift_register           is ethernet_rx.fcs_shift_register          ;
    begin
        if bytearray_index_counter > 0 and bytearray_index_counter /= bytearray'high then
            bytearray_index_counter <= bytearray_index_counter + 1;
 
            if ethernet_rx.fcs_shift_register = ethernet_fcs_checksum then
                test_data(bytearray_index_counter) <= x"EE";
            else
                test_data(bytearray_index_counter) <= x"dd";
            end if;
        else
            bytearray_index_counter <= 0;
            fcs_shift_register <= (others => '1');
 
            frame_receiver_state <= wait_for_start_of_frame;
        end if;
 
         
    end idle_ethernet_rx;
				
			

The console application prints the received octets in groups of 2, or 16 bits, and prints them ordered from msb to lsb. Thus the received octets are read from right to left and from top to bottom. Rows 0 – 2F are the ethernet frame data, rows 30 – 3F are the padded bytes, where “EE” indicates a correctly received frame, and the last 32 registers 0x40 to 0x5F are the MDIO registers.

Figure 3. Correctly captured ethernet frame, with checksum 0x103ff6f6

The valid frame bytes in the figure are ff ff ff ff ff ff c4 65 16 ae 5e 4f 08 00 45 00 00 4e 97 49 00 00 80 11 fc a7 a9 fe 52 b1 a9 fe ff ff 00 89 00 89 00 3a f7 a8 f0 9b 01 10 00 01 00 00 00 00 00 00 20 45 45 45 42 45 4f 45 47 45 50 46 44 46 44 43 41 43 41 43 41 43 41 43 41 43 41 43 41 43 41 42 4d 00 00 20 00 01 6f 6f f3 01 which can be put to a crc calculator to verify the frame, for example in

http://www.sunshine2k.de/coding/javascript/crc/crc_js.html

The VHDL code checks for 0xc704dd7b, which can be obtained by copying and pasting the valid bytes, unchecking the result reflected option and setting the final xor value to 0 in the above crc calculator.

As experience has shown, the frame capture, though relatively simple, is quite wonky to get correct since there are quite a few places where the bit and byte order can be reversed unintentionally. Since there is some bit and byte reordering present in the uart also, it is quite easy to get the correct result with several bugs just cancelling each other out, or to have a wrong result that looks correct. If the data is read from the ddr buffer at the wrong clock edge, or if the pll phase shift is not correct, the data is captured out of phase.

With ethernet frame now received correctly, the next thing to do is to buffer the ethernet frame to dual port ram which then allows the ethernet frame to be processed by a minimal amount of higher layer protocols.
