Envelope Detection using FPGA board

Ahmed Alfadhel · July 24, 2019

Hi,

I am using ARTY 7 with a Pmod DA3.

I generated a frequency-hopping (FH) signal, then used these hops to modulate a BFSK (Binary Frequency Shift Keying) signal.

Currently I am trying to recover the envelope of the BFSK signal, but I don't know how to do that on the FPGA board. I did a Google search and found that some people use the Hilbert Transform (HT) to recover the envelope of a modulated signal. However, I need more explanation of how to implement this technique on an FPGA.

The question is: how do I detect the envelope of a modulated signal using an FPGA board?

Kindly see the two attached pictures: the first illustrates the envelope detection process, and the second shows the block diagram of the system that I built on my FPGA.

Thanks.

hamster · July 31, 2019

I was intrigued enough by the Hilbert Transform to actually learn, experiment and implement it in VHDL. The math behind it is pretty nifty.

Here's a very naively implemented example, using a short FIR filter.

You can find this, a test bench and simulation output at http://hamsterworks.co.nz/mediawiki/index.php/Hilbert_Transform

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity hilbert_transformer is
    Port ( clk      : in  STD_LOGIC;
           real_in  : in  STD_LOGIC_VECTOR (9 downto 0);
           real_out : out  STD_LOGIC_VECTOR (10 downto 0) := (others => '0');
           imag_out : out  STD_LOGIC_VECTOR (10 downto 0) := (others => '0'));
end hilbert_transformer;

architecture Behavioral of hilbert_transformer is
   -- Constants are 2/(n * pi) * 512, for n of -7,-5,-3,-1,1,3,5,7
   constant kernel0  : signed(real_in'length-1 downto 0) := to_signed( -47, real_in'length);
   constant kernel2  : signed(real_in'length-1 downto 0) := to_signed( -66, real_in'length);
   constant kernel4  : signed(real_in'length-1 downto 0) := to_signed(-109, real_in'length);
   constant kernel6  : signed(real_in'length-1 downto 0) := to_signed(-326, real_in'length);
   constant kernel8  : signed(real_in'length-1 downto 0) := to_signed( 326, real_in'length);
   constant kernel10 : signed(real_in'length-1 downto 0) := to_signed( 109, real_in'length);
   constant kernel12 : signed(real_in'length-1 downto 0) := to_signed(  66, real_in'length);
   constant kernel14 : signed(real_in'length-1 downto 0) := to_signed(  47, real_in'length);

   type a_delay is array (0 to 14) of signed(real_in'high downto 0);

   signal delay : a_delay := (others => (others => '0'));
   signal tap0  : signed(real_in'length+kernel0'length-1  downto 0) := (others => '0');
   signal tap2  : signed(real_in'length+kernel2'length-1  downto 0) := (others => '0');
   signal tap4  : signed(real_in'length+kernel4'length-1  downto 0) := (others => '0');
   signal tap6  : signed(real_in'length+kernel6'length-1  downto 0) := (others => '0');
   signal tap8  : signed(real_in'length+kernel8'length-1  downto 0) := (others => '0');
   signal tap10 : signed(real_in'length+kernel10'length-1 downto 0) := (others => '0');
   signal tap12 : signed(real_in'length+kernel12'length-1 downto 0) := (others => '0');
   signal tap14 : signed(real_in'length+kernel14'length-1 downto 0) := (others => '0');
   
begin

process(clk) 
   variable imag_tmp : signed(real_in'length*2-1 downto 0);
   begin
      if   rising_edge(clk) then 
         
         real_out <= std_logic_vector(resize(delay(8),real_out'length));  -- deliberately advanced by one due to latency
         
         imag_tmp := tap0 + tap2  + tap4  + tap6 
                   + tap8 + tap10 + tap12 + tap14;
         imag_out <= std_logic_vector(imag_tmp(imag_tmp'high downto imag_tmp'high-imag_out'high));
         
         tap0  <= delay(0)  * kernel0;
         tap2  <= delay(2)  * kernel2;
         tap4  <= delay(4)  * kernel4;
         tap6  <= delay(6)  * kernel6;
         tap8  <= delay(8)  * kernel8;
         tap10 <= delay(10) * kernel10;
         tap12 <= delay(12) * kernel12;
         tap14 <= delay(14) * kernel14;
         
         -- Update the delay line 
         delay(1 to 14) <= delay(0 to 13) ;
         delay(0)       <= signed(real_in);
      end if;
   end process;
end Behavioral;
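
The tap constants in the listing above can be cross-checked in software. The following is a minimal sketch, not part of the original post: `hilbert_tap` is a hypothetical helper name, and the rounding rule (away from zero) is an assumption inferred from the constants themselves, since round-to-nearest would give 65 rather than 66 for n = 5.

```c
#include <assert.h>

/* Hypothetical helper (not from the original VHDL): integer FIR tap
   2/(n*pi)*scale for odd n, rounded away from zero (assumed rounding). */
int hilbert_tap(int n, int scale)
{
    const double pi = 3.14159265358979323846;
    double exact;
    int tap;

    assert(n != 0 && (n % 2) != 0);     /* even taps of this kernel are zero */
    exact = 2.0 * scale / (n * pi);
    tap = (int)exact;                   /* the cast truncates toward zero */
    if ((double)tap != exact)
        tap += (exact > 0.0) ? 1 : -1;  /* push the result away from zero */
    return tap;
}
```

Calling `hilbert_tap(n, 512)` for n = 1, 3, 5, 7 should reproduce the positive kernel values, and negative n the mirrored ones. For a wider data path only `scale` would need to change.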

xc6lx45 · August 3, 2019

On 8/1/2019 at 11:56 PM, hamster said:

Using a tool for what it is meant to do is easy. Using a tool for something where it isn't suited, that is where the learning begins!

That's the spirit

I'm just commenting because Hilbert transform looks like a wonderful tool for its conceptual simplicity. And textbooks get carried away on it. And of course it does have valid technical applications. But it can easily turn into the steamroller approach to making apple puree, and DSP tends to become unforgiving a few minutes into the game when "implementation effort" plays against "signal quality" on linear vs logarithmic scales.

At the end of the day, it boils down to the same idea - elimination of the negative frequencies so cos(omega t) = 1/2 (exp(- i omega t) + exp(i omega t)) becomes constant-envelope exp(i omega t)
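
For a single tone the idea can be written out in a few lines. This is only a sketch for the ideal case; the symbol \mathcal{H} for the Hilbert transform is notation assumed here and is not used elsewhere in the thread.

```latex
\begin{aligned}
x(t) &= \cos(\omega t) = \tfrac{1}{2}\left(e^{-i\omega t} + e^{i\omega t}\right),\\
\mathcal{H}\{x\}(t) &= \sin(\omega t) \qquad (\omega > 0),\\
z(t) &= x(t) + i\,\mathcal{H}\{x\}(t) = e^{i\omega t},\\
|z(t)| &= \sqrt{\cos^2(\omega t) + \sin^2(\omega t)} = 1 .
\end{aligned}
```

With an amplitude-modulated tone the same steps leave the modulation as |z(t)|, which is the envelope being asked about.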

Ahmed Alfadhel · August 2, 2019

10 hours ago, hamster said:

The error is because the magnitude has to be one bit longer than the inputs (the magnitude of (0xFFFFFF, 0xFFFFFF) is 0x16A09E4, which will overflow if you put it into a 25-bit signed value).

Hi @hamster, I increased the width of the ports and the signals to 28 bits and the error still exists!

I am looking forward to your reply.

Thanks.

hamster · August 1, 2019

1 hour ago, Ahmed Alfadhel said:

Hi,

Thank you @hamster for your elaboration on the Hilbert Transform. I modified your code to work with my design [...]

But when I run the simulation, I get this error:

[VRFC 10-9] a_x already imposes an index constraint ["D:/Users/dell/Complex_Envelop_detector_15kHzIF_Fig2/modem/modem.ip_user_files/bd/FH_modem/ipshared/e786/sim/magnitude.vhd":23]

How can I solve this error?

The error is because the magnitude has to be one bit longer than the inputs (the magnitude of (0xFFFFFF, 0xFFFFFF) is 0x16A09E4, which will overflow if you put it into a 25-bit signed value).

It will however fit nicely into a 25-bit unsigned value, and as it is a magnitude it will be positive. So maybe snip off the top bit in the assignment, but remember it is unsigned!
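
The bit-growth arithmetic can be checked with integer math alone. This is a small sketch under stated assumptions: the helper names (`isqrt_u64`, `magnitude_u`, `fits_signed`, `fits_unsigned`) are made up for illustration, and the magnitude is the plain Euclidean one, without the extra scale factor that a CORDIC stage introduces.

```c
#include <assert.h>
#include <stdint.h>

/* Integer square root: largest r with r*r <= v (bit-by-bit method). */
uint64_t isqrt_u64(uint64_t v)
{
    uint64_t res = 0;
    uint64_t bit = (uint64_t)1 << 62;

    while (bit > v)
        bit >>= 2;
    while (bit != 0) {
        if (v >= res + bit) {
            v -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* floor(sqrt(x*x + y*y)); operands limited so the sum fits in 64 bits. */
uint64_t magnitude_u(uint32_t x, uint32_t y)
{
    assert(x <= 0x7FFFFFFFu && y <= 0x7FFFFFFFu);
    return isqrt_u64((uint64_t)x * x + (uint64_t)y * y);
}

/* Does v fit in an unsigned field of `bits` bits? */
int fits_unsigned(uint64_t v, unsigned bits)
{
    assert(bits >= 1 && bits < 64);
    return v < ((uint64_t)1 << bits);
}

/* Does the non-negative value v fit in a two's-complement field of `bits` bits? */
int fits_signed(uint64_t v, unsigned bits)
{
    assert(bits >= 2 && bits < 64);
    return v < ((uint64_t)1 << (bits - 1));
}
```

Here `magnitude_u(0xFFFFFF, 0xFFFFFF)` gives the 0x16A09E4 quoted above, which `fits_unsigned(..., 25)` accepts and `fits_signed(..., 25)` rejects. Separately, the wording of the VRFC 10-9 message suggests the simulator is also objecting to declarations such as `a_x(24 downto 0)`, where a range is applied to an array type that is already declared as `array(0 to 5)`; dropping that extra range may be needed in addition to the width change.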

hamster · August 1, 2019

33 minutes ago, xc6lx45 said:

But this would be an example where it's trivially easy to generate the reference tone in quadrature. Multiply with the complex-valued reference tone, lowpass-filter to suppress the shifted negative frequency component and there's my analytic ("one-sided spectrum") signal for polar processing.

Using a tool for what it is meant to do is easy. Using a tool for something where it isn't suited, that is where the learning begins!

(I now go back to doing dental surgery with a steamroller, or maybe digging a tunnel with a teaspoon).

xc6lx45 · August 1, 2019

22 hours ago, hamster said:

Going to incorporate it into my (MCU-based) guitar tuner... but it is a nice tool to have in the kit.

But this would be an example where it's trivially easy to generate the reference tone in quadrature. Multiply with the complex-valued reference tone, lowpass-filter to suppress the shifted negative frequency component and there's my analytic ("one-sided spectrum") signal for polar processing.

Now to be honest I've never ever designed a guitar tuner, but I suspect that this, combined with a decimating lowpass filter (there is no point in maintaining an output rate much higher than the filter bandwidth), can be orders of magnitude cheaper, because I'm designing for the tuner's capture bandwidth (say, 10 % of the high E string fundamental would be ~30 Hz) instead of audio frequency.
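
The mix-and-lowpass idea can be sketched in integer C. Several things here are assumptions made only to keep the example short: the carrier sits at exactly one quarter of the sample rate (so the reference tone exp(-j*pi/2*n) is just the sequence 1, -j, -1, +j and needs no multipliers), the lowpass is a simple leaky integrator with a factor of 64, there is no decimation, and all names (`iq_mixer`, `iq_mixer_step`, `iq_mixer_power`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* State for the sketch: two leaky integrators and the reference phase. */
typedef struct {
    int32_t i_acc;   /* lowpass-filtered in-phase component   */
    int32_t q_acc;   /* lowpass-filtered quadrature component */
    unsigned phase;  /* index into the 1, -j, -1, +j sequence */
} iq_mixer;

void iq_mixer_init(iq_mixer *m)
{
    assert(m != 0);
    m->i_acc = 0;
    m->q_acc = 0;
    m->phase = 0;
}

/* Mix one real sample with the fs/4 reference, then lowpass both arms. */
void iq_mixer_step(iq_mixer *m, int32_t sample)
{
    int32_t i_mix = 0, q_mix = 0;

    switch (m->phase) {
    case 0:  i_mix =  sample; break;  /* sample * (+1) */
    case 1:  q_mix = -sample; break;  /* sample * (-j) */
    case 2:  i_mix = -sample; break;  /* sample * (-1) */
    default: q_mix =  sample; break;  /* sample * (+j) */
    }
    m->phase = (m->phase + 1) & 3;

    /* Leaky integrators: DC gain of 64, image at fs/2 is attenuated. */
    m->i_acc += i_mix - m->i_acc / 64;
    m->q_acc += q_mix - m->q_acc / 64;
}

/* Squared magnitude of the filtered I/Q pair (avoids a square root). */
int64_t iq_mixer_power(const iq_mixer *m)
{
    return (int64_t)m->i_acc * m->i_acc + (int64_t)m->q_acc * m->q_acc;
}
```

For a tone of amplitude A at the assumed carrier, each arm carries half the amplitude and the integrator gain is 64, so the squared magnitude settles near (32*A)^2 whatever the carrier phase. A real design would pick the reference frequency and filter to suit the signal.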

Ahmed Alfadhel · August 1, 2019

Hi,

Thank you @hamster for your elaboration on the Hilbert Transform. I modified your code to work with my design, as follows:

--Hilbert Transformer
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity hilbert_transformer is
    Port ( clk      : in  STD_LOGIC;
           real_in  : in  STD_LOGIC_VECTOR (23 downto 0);
           real_out : out  STD_LOGIC_VECTOR (24 downto 0) := (others => '0');
           imag_out : out  STD_LOGIC_VECTOR (24 downto 0) := (others => '0'));
end hilbert_transformer;

architecture Behavioral of hilbert_transformer is
   -- Constants are 2/(n * pi) * 512, for n of -7,-5,-3,-1,1,3,5,7
   constant kernel0  : signed(real_in'length-1 downto 0) := to_signed( -47, real_in'length);
   constant kernel2  : signed(real_in'length-1 downto 0) := to_signed( -66, real_in'length);
   constant kernel4  : signed(real_in'length-1 downto 0) := to_signed(-109, real_in'length);
   constant kernel6  : signed(real_in'length-1 downto 0) := to_signed(-326, real_in'length);
   constant kernel8  : signed(real_in'length-1 downto 0) := to_signed( 326, real_in'length);
   constant kernel10 : signed(real_in'length-1 downto 0) := to_signed( 109, real_in'length);
   constant kernel12 : signed(real_in'length-1 downto 0) := to_signed(  66, real_in'length);
   constant kernel14 : signed(real_in'length-1 downto 0) := to_signed(  47, real_in'length);

   type a_delay is array (0 to 14) of signed(real_in'high downto 0);

   signal delay : a_delay := (others => (others => '0'));
   signal tap0  : signed(real_in'length+kernel0'length-1  downto 0) := (others => '0');
   signal tap2  : signed(real_in'length+kernel2'length-1  downto 0) := (others => '0');
   signal tap4  : signed(real_in'length+kernel4'length-1  downto 0) := (others => '0');
   signal tap6  : signed(real_in'length+kernel6'length-1  downto 0) := (others => '0');
   signal tap8  : signed(real_in'length+kernel8'length-1  downto 0) := (others => '0');
   signal tap10 : signed(real_in'length+kernel10'length-1 downto 0) := (others => '0');
   signal tap12 : signed(real_in'length+kernel12'length-1 downto 0) := (others => '0');
   signal tap14 : signed(real_in'length+kernel14'length-1 downto 0) := (others => '0');
   
begin

process(clk) 
   variable imag_tmp : signed(real_in'length*2-1 downto 0);
   begin
      if   rising_edge(clk) then 
         
         real_out <= std_logic_vector(resize(delay(8),real_out'length));  -- deliberately advanced by one due to latency
         
         imag_tmp := tap0 + tap2  + tap4  + tap6 
                   + tap8 + tap10 + tap12 + tap14;
         imag_out <= std_logic_vector(imag_tmp(imag_tmp'high downto imag_tmp'high-imag_out'high));
         
         tap0  <= delay(0)  * kernel0;
         tap2  <= delay(2)  * kernel2;
         tap4  <= delay(4)  * kernel4;
         tap6  <= delay(6)  * kernel6;
         tap8  <= delay(8)  * kernel8;
         tap10 <= delay(10) * kernel10;
         tap12 <= delay(12) * kernel12;
         tap14 <= delay(14) * kernel14;
         
         -- Update the delay line 
         delay(1 to 14) <= delay(0 to 13) ;
         delay(0)       <= signed(real_in);
      end if;
   end process;
end Behavioral;

and

-- magnitude IP
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity magnitude is
    Port ( 
        clk           : in std_logic;
        x_in          : in std_logic_vector(24 downto 0);
        y_in          : in std_logic_vector(24 downto 0);
        x_out         : out std_logic_vector(24 downto 0) := (others => '0');
        y_out         : out std_logic_vector(24 downto 0) := (others => '0');
        magnitude_out : out std_logic_vector(24 downto 0) := (others => '0') -- Accurate to 5 bits or so
    );
end magnitude;

architecture Behavioral of magnitude is

    type a_x is array(0 to 5) of signed(x_in'high+1 downto 0);
    type a_y is array(0 to 5) of signed(y_in'high+1 downto 0);
    type a_x_delay is array(0 to 5) of std_logic_vector(x_in'high downto 0);
    type a_y_delay is array(0 to 5) of std_logic_vector(y_in'high downto 0);
   
-- line 23 (error occurred here)
    signal x : a_x(24 downto 0) := (others => (others => '0'));
    signal y : a_y(24 downto 0) := (others => (others => '0'));
    signal x_delay : a_x_delay(24 downto 0) := (others => (others => '0'));
    signal y_delay : a_y_delay(24 downto 0) := (others => (others => '0'));
    
begin

    magnitude_out <= std_logic_vector(y(5));
    x_out <= x_delay(x_delay'high);
    y_out <= y_delay(y_delay'high);

process(clk)
    begin
        if rising_edge(clk) then
            if x(4) >= 0 then
                -- x(5) is not needed
                y(5) <= y(4) + x(4)(x(4)'high downto 4);
            else
                -- x(5) is not needed
                y(5) <= y(4) - x(4)(x(4)'high downto 4);
            end if;
            
            if x(3) >= 0 then
                x(4) <= x(3) - y(3)(y(3)'high downto 3);
                y(4) <= y(3) + x(3)(x(3)'high downto 3);
            else
                x(4) <= x(3) + y(3)(y(3)'high downto 3);
                y(4) <= y(3) - x(3)(x(3)'high downto 3);
            end if;
            
            if x(2) >= 0 then
                x(3) <= x(2) - y(2)(y(2)'high downto 2);
                y(3) <= y(2) + x(2)(x(2)'high downto 2);
            else
                x(3) <= x(2) + y(2)(y(2)'high downto 2);
                y(3) <= y(2) - x(2)(x(2)'high downto 2);
            end if;
            
            if x(1) >= 0 then
                x(2) <= x(1) - y(1)(y(1)'high downto 1);
                y(2) <= y(1) + x(1)(x(1)'high downto 1);
            else
                x(2) <= x(1) + y(1)(y(1)'high downto 1);
                y(2) <= y(1) - x(1)(x(1)'high downto 1);
            end if;
            
            if x(0) >= 0 then
                x(1) <= x(0) - y(0)(y(0)'high downto 0);
                y(1) <= y(0) + x(0)(x(0)'high downto 0);
            else
                x(1) <= x(0) + y(0)(y(0)'high downto 0);
                y(1) <= y(0) - x(0)(x(0)'high downto 0);
            end if;
            
            if y_in(y_in'high) = '1' then
                x(0) <= signed(x_in(x_in'high) & x_in);
                y(0) <= signed(to_signed(0,y_in'length+1)-signed(y_in));
            else
                x(0) <= signed(x_in(x_in'high) & x_in);
                y(0) <= signed(y_in(y_in'high) & y_in);
            end if;
            
            -- Delay to output the inputs, so they are aligned with the magnitudes
            x_delay(1 to 5) <= x_delay(0 to 4);
            y_delay(1 to 5) <= y_delay(0 to 4);
            x_delay(0) <= x_in;
            y_delay(0) <= y_in;
        end if;
    end process;
 


end Behavioral;

and I used my own test bench:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.all;
use std.textio.all ;
use ieee.std_logic_textio.all ;

-- Entity
entity FHSS_TX_Test_Bench_sim is
end;

-- Architecture
architecture test of FHSS_TX_Test_Bench_sim is
-- Our UART Transmitter Design Instantiation
component FH_modem_wrapper
port (
  BFSK : out STD_LOGIC_VECTOR ( 7 downto 0 );
  FH : out STD_LOGIC_VECTOR ( 7 downto 0 );
  spreaded_signal : out STD_LOGIC_VECTOR ( 7 downto 0 );
  despreaded : out STD_LOGIC_VECTOR ( 7 downto 0 );
  IF_BPF : out STD_LOGIC_VECTOR (23 downto 0);
  x : out STD_LOGIC_VECTOR ( 24 downto 0 );
  y : out STD_LOGIC_VECTOR ( 24 downto 0 );
  mag_out : out STD_LOGIC_VECTOR ( 24 downto 0 );
  absolute2 : out STD_LOGIC_VECTOR ( 23 downto 0 );
  envelop : out STD_LOGIC_VECTOR ( 47 downto 0 );
  sys_clock : in STD_LOGIC;
  reset : in STD_LOGIC
    );
end component;

-- Simulation signals
signal clk_sim 			: std_logic := '0';
signal reset            : std_logic := '1';
signal BFSK 			: std_logic_vector(7 downto 0);
signal FH 			: std_logic_vector(7 downto 0);
signal spreaded_signal 			: std_logic_vector(7 downto 0);
signal despreaded 			: std_logic_vector(7 downto 0);
signal IF_BPF               : std_logic_vector(23 downto 0);
signal x               : std_logic_vector(24 downto 0);
signal y               : std_logic_vector(24 downto 0); 
signal mag_out               : std_logic_vector(24 downto 0); 
signal absolute2        : STD_LOGIC_VECTOR ( 23 downto 0 );
signal envelop        : STD_LOGIC_VECTOR ( 47 downto 0 );


begin
	-- UART Transmitter port mapping
	dev_to_test:  FH_modem_wrapper
		port map(BFSK, FH, spreaded_signal, despreaded, IF_BPF, x, y, mag_out, absolute2,envelop, clk_sim, reset );
	
	-- Simulate the input clock to our design
	clk_proc : process
		begin
		wait for 5 ns;
		clk_sim <= not clk_sim;
	end process clk_proc;
	

	end test;

But when I run the simulation, I get this error:

[VRFC 10-9] a_x already imposes an index constraint ["D:/Users/dell/Complex_Envelop_detector_15kHzIF_Fig2/modem/modem.ip_user_files/bd/FH_modem/ipshared/e786/sim/magnitude.vhd":23]

I have marked line 23 with a comment just before it.

How can I solve this error?

I am looking forward to your reply.
Thanks.

hamster · August 1, 2019

Oh, a quick hack of a CORDIC magnitude

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity magnitude is
    Port ( 
        clk           : in std_logic;
        x_in          : in std_logic_vector;
        y_in          : in std_logic_vector;
        x_out         : out std_logic_vector := (others => '0');
        y_out         : out std_logic_vector := (others => '0');
        magnitude_out : out std_logic_vector := (others => '0') -- Accurate to 5 bits or so
    );
end magnitude;

architecture Behavioral of magnitude is

    type a_x is array(0 to 5) of signed(x_in'high+1 downto 0);
    type a_y is array(0 to 5) of signed(y_in'high+1 downto 0);
    type a_x_delay is array(0 to 5) of std_logic_vector(x_in'high downto 0);
    type a_y_delay is array(0 to 5) of std_logic_vector(y_in'high downto 0);
    
    signal x : a_x := (others => (others => '0'));
    signal y : a_y := (others => (others => '0'));
    signal x_delay : a_x_delay := (others => (others => '0'));
    signal y_delay : a_y_delay := (others => (others => '0'));
    
begin

    magnitude_out <= std_logic_vector(y(5));
    x_out <= x_delay(x_delay'high);
    y_out <= y_delay(y_delay'high);

process(clk)
    begin
        if rising_edge(clk) then
            if x(4) >= 0 then
                -- x(5) is not needed
                y(5) <= y(4) + x(4)(x(4)'high downto 4);
            else
                -- x(5) is not needed
                y(5) <= y(4) - x(4)(x(4)'high downto 4);
            end if;
            
            if x(3) >= 0 then
                x(4) <= x(3) - y(3)(y(3)'high downto 3);
                y(4) <= y(3) + x(3)(x(3)'high downto 3);
            else
                x(4) <= x(3) + y(3)(y(3)'high downto 3);
                y(4) <= y(3) - x(3)(x(3)'high downto 3);
            end if;
            
            if x(2) >= 0 then
                x(3) <= x(2) - y(2)(y(2)'high downto 2);
                y(3) <= y(2) + x(2)(x(2)'high downto 2);
            else
                x(3) <= x(2) + y(2)(y(2)'high downto 2);
                y(3) <= y(2) - x(2)(x(2)'high downto 2);
            end if;
            
            if x(1) >= 0 then
                x(2) <= x(1) - y(1)(y(1)'high downto 1);
                y(2) <= y(1) + x(1)(x(1)'high downto 1);
            else
                x(2) <= x(1) + y(1)(y(1)'high downto 1);
                y(2) <= y(1) - x(1)(x(1)'high downto 1);
            end if;
            
            if x(0) >= 0 then
                x(1) <= x(0) - y(0)(y(0)'high downto 0);
                y(1) <= y(0) + x(0)(x(0)'high downto 0);
            else
                x(1) <= x(0) + y(0)(y(0)'high downto 0);
                y(1) <= y(0) - x(0)(x(0)'high downto 0);
            end if;
            
            if y_in(y_in'high) = '1' then
                x(0) <= signed(x_in(x_in'high) & x_in);
                y(0) <= signed(to_signed(0,y_in'length+1)-signed(y_in));
            else
                x(0) <= signed(x_in(x_in'high) & x_in);
                y(0) <= signed(y_in(y_in'high) & y_in);
            end if;
            
            -- Delay to output the inputs, so they are aligned with the magnitudes
            x_delay(1 to 5) <= x_delay(0 to 4);
            y_delay(1 to 5) <= y_delay(0 to 4);
            x_delay(0) <= x_in;
            y_delay(0) <= y_in;
        end if;
    end process;
 end Behavioral;

Chaining the two together seems to work. The top trace is the input, the second trace is the delayed input, the third is the delayed output of the Hilbert filter, and the last is the scaled magnitude of the complex x+iy signal.

NOTE: I know for sure that these are buggy (they have range overflows), but they should give an idea of how @Ahmed Alfadhel could implement it.

magnitude.vhd hilbert_transformer.vhd tb_hilbert_transformer.vhd
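
A software model of the magnitude stage can help when checking simulation traces. The following is a sketch that mirrors the stage structure of the VHDL listing above (fold y into the upper half-plane, then five shift-and-add rotations). The function names are hypothetical, and `asr_floor` assumes that taking the upper bits of a signed vector behaves like a flooring shift.

```c
#include <assert.h>

#define CORDIC_STAGES 5  /* five rotation stages, as in the VHDL listing */

/* Shift right by s bits, rounding toward minus infinity for negative
   values (assumed to match slicing off the low bits of a signed vector). */
long asr_floor(long v, int s)
{
    assert(s >= 0 && s < 31);
    if (v >= 0)
        return v >> s;
    return -((-v + (1L << s) - 1) >> s);
}

/* Rotate (x, y) toward the y axis; y ends up holding the scaled magnitude. */
long cordic_magnitude(long x, long y)
{
    int i;

    if (y < 0)
        y = -y;                       /* stage 0: make y non-negative */

    for (i = 0; i < CORDIC_STAGES; i++) {
        long xs = asr_floor(x, i);    /* both shifts use the old values */
        long ys = asr_floor(y, i);
        if (x >= 0) {
            x -= ys;
            y += xs;
        } else {
            x += ys;
            y -= xs;
        }
    }
    return y;
}
```

With five stages the result carries a gain of about 1.646, so `cordic_magnitude(3000, 4000)` comes out near 5000 * 1.646 rather than 5000. That gain is the reason the output needs more headroom than the inputs.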

hamster · July 31, 2019

11 hours ago, D@n said:

@hamster,

Not bad, not bad at all ... [...]

Other than that, pretty cool! Did you find anything useful to test it on?

Dan

Going to incorporate it into my (MCU-based) guitar tuner... but it is a nice tool to have in the kit.

zygot · July 31, 2019

4 hours ago, hamster said:

I was intrigued enough by the Hilbert Transform to actually learn, experiment and implement it in VHDL

Oh, I think that you've started an investigation that will be hard for you to stop

D@n · July 31, 2019

@hamster,

Not bad, not bad at all ... just some feedback for you though:

- The "official" Hilbert transform tap generation suffers from the same Gibbs phenomena that keeps folks from using the "ideal lowpass filter" (i.e. sin x/x)
- You could "window" the filter to get better performance, or you could try using Parks-McClellan to get better taps. There are tricks to designing filters with quantized taps as well ... however the ones I know are ad-hoc and probably about the same as what you did above
- There's symmetry in the filter. For half as many multiplies you can take sample differences, and then apply the multiplies to those sample differences.

Other than that, pretty cool! Did you find anything useful to test it on?

Dan
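
The symmetry point can be illustrated with the tap values from the VHDL listing in this thread. This is a sketch with hypothetical function names: `hilbert_direct` uses one multiply per non-zero tap, while `hilbert_folded` takes the sample differences first and needs half as many multiplies.

```c
#include <assert.h>

#define HILBERT_SPAN 15  /* delay line length used by the VHDL listing */

/* Direct form: eight multiplies, one per non-zero tap (delays 0,2,...,14). */
long hilbert_direct(const int d[HILBERT_SPAN])
{
    static const int k[8] = { -47, -66, -109, -326, 326, 109, 66, 47 };
    long acc = 0;
    int t;

    assert(d != 0);
    for (t = 0; t < 8; t++)
        acc += (long)d[2 * t] * k[t];
    return acc;
}

/* Folded form: subtract mirrored samples first, then four multiplies. */
long hilbert_folded(const int d[HILBERT_SPAN])
{
    static const int k[4] = { 326, 109, 66, 47 };
    long acc = 0;
    int t;

    assert(d != 0);
    for (t = 0; t < 4; t++)
        acc += (long)(d[8 + 2 * t] - d[6 - 2 * t]) * k[t];
    return acc;
}
```

Both functions return the same sum for any delay-line contents. In hardware the subtraction adds one bit to the operand width, which is worth keeping in mind when sizing the multipliers.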

xc6lx45 · July 29, 2019

On 7/25/2019 at 11:13 AM, hamster said:

it looks like you just need to apply a low-pass filter

Yes, for an application with basic requirements, like receiver gain control, this will probably work just fine (it's equivalent to an analog envelope detector). However, it needs a fairly high bandwidth margin between the modulation and the carrier, and that may make it problematic in more sophisticated DSP applications (say, "polar" signal processing, where I try to reconstruct the signal from the envelope) where the tolerable noise level is orders of magnitude lower.

hamster · July 28, 2019

Oh, for what it's worth I've been toying with the Hilbert Transform. Here is an example of it:

#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846  /* not guaranteed by strict ISO C */
#endif

#define SAMPLES 1000
#define HALF_WIDTH 11  /* kernel spans taps -11..+11; only odd taps are non-zero */

float x[SAMPLES];

int main(void) {
  int i;
  /* Build some test data */
  for(i = 0; i < SAMPLES; i++) {
     x[i] = cos(2*M_PI*i/10.3);
  }

  /* Now apply the Hilbert Transform and see what we get */
  /* It should be close to sin(2*M_PI*i/10.3) */
  for(i = HALF_WIDTH; i < SAMPLES-HALF_WIDTH-1; i++) {
    double h = 0;
    int j;

    /* Apply the kernel: the taps are antisymmetric, so pair x[i-j] with x[i+j] */
    for(j = 1; j <= HALF_WIDTH; j+=2)
      h += (x[i-j]-x[i+j]) * 2.0/(j*M_PI);

    /* Print the input sample and its Hilbert transform */
    printf("%8.5f, %8.5f\n", x[i], h);
  }
  return 0;
}

hamster · July 25, 2019

Oh, having a look at the full signal chain, it looks like you just need to apply a low-pass filter to the absolute value of the signal. It might be just as simple as:

if sample < 0 then
   filter := filter - filter/64 - sample;
else
   filter := filter - filter/64 + sample;
end if;

Change the value "64" depending on your sample rate and desired cutoff frequency. If your needs get very complex, you might need to use a FIR low-pass filter instead.

Run some sample data through it in Matlab or Excel (or heavens forbid, some C code) and see what happens.

hamster · July 25, 2019

Hi @Ahmed Alfadhel

I had the C code handy because I have been working on an atan2(y,x) implementation for FPGAs, and had been testing ideas.

I left it in C because I don't really know your requirements, but I wanted to give you a working algorithm, complete with proof that it does work, so that you can tinker with it, see how it works, and make use of it. Oh, and I must admit that it was also because I am lazy.

But seriously:

- I don't know if you use VHDL or Verilog, or some HLS tool

- I don't know if your inputs are 4 bits or 40 bits long,

- I don't know if you need the answer to be within 10% or 0.0001%

- I don't know if it has to run at 40 MHz or 400 MHz

- I don't know if you have 1000s of cycles to process each sample, or just one.

- I don't even know if you need the algorithm at all!

But it has been written to be trivially converted to any HDL as it only uses bit shifts and addition/subtraction. But maybe more importantly you can then use it during any subsequent debugging to verify that you correctly implemented it.

For an example of how trivial it is to convert to HDL:

    if(x > 0) { x += -ty/8;  y +=  tx/8;}
    else      { x +=  ty/8;  y += -tx/8;}

could be implemented as

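-- tx and ty hold the values from before this step, as in the C code,
-- so that the update of y does not see the already-updated x
tx := x; ty := y;
IF x(x'high) = '0' THEN
   x := x - resize(ty(ty'high downto 3), ty'length);
   y := y + resize(tx(tx'high downto 3), tx'length);
ELSE
   x := x + resize(ty(ty'high downto 3), ty'length);
   y := y - resize(tx(tx'high downto 3), tx'length);
END IF;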

My suggestion is that, should you choose to use it, you compile the C program, making the main() function a sort of test bench, and then work out exactly what you need to implement in your HDL. You will then spend very little time writing, debugging and improving the HDL because you will have a very clear idea of what you are implementing.

Ahmed Alfadhel · July 25, 2019

Hi @hamster, @D@n, @xc6lx45

Thank you for your replies.

I used two stages of modulation, as I mentioned above:

12 hours ago, Ahmed Alfadhel said:

I generated frequency hops (FH) signal then I used these hops to modulate a BFSK (Binary Frequency Shift Keying) signal .

This leads to a modulated signal similar to an AM signal, since the MFSK (frequency hop) frequencies are higher than the BFSK frequency.

Kindly, see the attached pictures. The second picture is the transmitted signal (modulated signal).

I would really appreciate any further notes.

@hamster, I don't understand why you embedded C code here. Where can I use it?

Thanks.

D@n · July 25, 2019

@hamster,

The Hilbert transform is actually pretty simple. You can build it from a half band lowpass filter that's just shifted up in frequency so that it cuts off at 0Hz and Nyquist. Perhaps this core might give you some ideas, although ... if I recall correctly this implementation only calculates the imaginary part of the Hilbert transform. (The real part doesn't change, but it does need to be delayed so that the two match.)

@Ahmed Alfadhel,

I agree with @hamster, you are asking the wrong question. The "envelope" of a signal is the amplitude of the function that gets multiplied by the carrier, such as the m(t) in the expression m(t)cos(2*pi*f_c *t). An FSK signal should not have any envelope to it at all, since all the information is contained in the frequency--something like the m(t) in cos(2*pi*(f_c + m(t))*t). Technically, the signal that results should all be at the same (complex) amplitude so ... something in your question doesn't make sense. Amplitude shift keying, phase shift keying, quadrature amplitude modulation, etc., those will all have non-constant envelopes to them, but they don't match the drawings in your second figure. Now if you apply a matched filter to your FSK signal, that might put a bit of an amplitude on the result ... but that's another story.

My guess is that either you aren't working with FSK, or there's something missing from your charts up above--the FM discriminator, but that's a longer discussion to have elsewhere. Just to make matters worse, some FM discriminators are susceptible to amplitude variations and ... that'd really mess up what you are trying to accomplish above.

Dan

hamster · July 24, 2019

Um, are you sure that you are asking the right question?

If the signal is BFSK, it should have pretty much a constant envelope, as only the frequency is changed.

hamster · July 24, 2019

Hi, sorry to barge in, but if anybody can point me to the Hilbert Transform info I would be very grateful.

However, here is an FPGA friendly way to calculate mag = sqrt(x*x+y*y), with about a 99% accuracy. You can easily see the pattern to get whatever accuracy you need.

#include <math.h>
#include <stdio.h>

#define M_SCALE (16)             /* Scaling for the magnitude calc */

 void cordic_mag(int x,int y, int  *mag) {
    int tx, ty;
    x *= M_SCALE;
    y *= M_SCALE;

    /* This step makes the CORDIC gain about 2 */
    if(y < 0) { x = -(x+x/4-x/32-x/256); y = -(y+y/4-y/32-y/256); }
    else      { x =  (x+x/4-x/32-x/256); y =  (y+y/4-y/32-y/256); }

    tx = x; ty = y;
    if(x > 0) { x += -ty/1;  y +=  tx/1;}
    else      { x +=  ty/1;  y += -tx/1;}

    tx = x; ty = y;
    if(x > 0) { x += -ty/2;  y +=  tx/2;}
    else      { x +=  ty/2;  y += -tx/2;}

    tx = x; ty = y;
    if(x > 0) { x += -ty/4;  y +=  tx/4;}
    else      { x +=  ty/4;  y += -tx/4;}

    tx = x; ty = y;
    if(x > 0) { x += -ty/8;  y +=  tx/8;}
    else      { x +=  ty/8;  y += -tx/8;}

    tx = x; ty = y;
    if(x > 0) { x += -ty/16;  y +=  tx/16;}
    else      { x +=  ty/16;  y += -tx/16;}

    *mag = y/M_SCALE/2; /* use y from after the final iteration; the 2 is to remove the CORDIC gain */
 }

 int main(int argc, char *argv[]) {
    int i;
    int cases = 300;

    printf("Input       Calculated      CORDIC  Error\n");
    for(i = 0; i < cases; i++) {
       float angle = 2*M_PI*i/cases;
       int x = sin(angle)*20000;
       int y = cos(angle)*20000;
       int mag, a_mag   = (int)sqrt(x*x+y*y);

       cordic_mag(x,y, &mag);

       printf("%6i %6i  = %6i  vs %6i   %4i\n",
               x, y, a_mag, mag, mag-a_mag);
    }
 }

Oh, here is the output with a couple more iterations added.

Input       Calculated      CORDIC   Error
     0  20000  =  20000  vs  19999     -1
   418  19995  =  19999  vs  19995     -4
   837  19982  =  19999  vs  20001      2
  1255  19960  =  19999  vs  19998     -1
  1673  19929  =  19999  vs  19995     -4
  2090  19890  =  19999  vs  20001      2
  2506  19842  =  19999  vs  19998     -1
  2921  19785  =  19999  vs  19996     -3
  3335  19719  =  19999  vs  20001      2
  3747  19645  =  19999  vs  19998     -1
  4158  19562  =  19999  vs  19996     -3
  4567  19471  =  19999  vs  20001      2
  4973  19371  =  19999  vs  19997     -2
  5378  19263  =  19999  vs  19996     -3
  5780  19146  =  19999  vs  20001      2
  6180  19021  =  19999  vs  19998     -1
  6577  18887  =  19999  vs  19999      0
  6971  18745  =  19999  vs  20001      2
  7362  18595  =  19999  vs  19993     -6

xc6lx45 · July 24, 2019

Well, yes and no. The question I'd ask is: can you use a local oscillator somewhere in your signal path with a 90-degree offset replica? In many cases this is trivially easy ("trivially" because I can, e.g., divide digitally from double frequency or, somewhat less trivially, use, say, a polyphase filter. Either way, it's probably easier on the LO than on the information signal because it's a single discrete frequency at a time, whereas the Hilbert transform approach needs to deal with the information signal bandwidth).

If so, downconvert with sine and cosine ("direct conversion") and the result will be just the same. After lowpass filtering, square, add, and take the square root: there's your envelope. When throughput / cost matters (think "envelope tracking" on cellphones) it is not uncommon to design RTL in square-of-envelope units to avoid the square root operation. Or, if accuracy is not that critical, consider a nonlinear bit-level approximation; see "root of less evil", R. Lyons.

Of course, Hilbert transform is a viable alternative, just a FIR filter (if complex-valued).

In case you can't tell the answer right away, I recommend you do the experiment in the design tools what happens if you try to reach 0 Hz (hint, "Time-bandwidth product, Mr. Heisenberg". Eventually it boils down to fractional bandwidth and phase-shifting DC remains an unsolved problem...).
