当前位置：首页 » 开发推广 » fft优化

fft优化

发布时间: 2021-03-22 02:39:03

⑴ 在DSP上实现FFT，每一级的输入怎么确定的

2009-02-05 19:43void FFT( COMPLEX *Y, int N) /* input sample array, number of points */
{
COMPLEX temp1,temp2; /*temporary storage variables */
int i,j,k; /*loop counter variables */
int upper_leg, lower_leg; /*index of upper/lower butterfly leg */
int leg_diff; /*difference between upper/lower leg */
int num_stages=0; /*number of FFT stages, or iterations */
int index, step; /*index and step between twiddle factor*/

/* log(base 2) of # of points = # of stages */
i=1;
do
{
num_stages+=1;
i = i *2 ;
} while (i!=N);

/* starting difference between upper and lower butterfly legs*/
leg_diff = N/2;

/* step between values in twiddle factor array twiddle.h */
step = 512 / N;

/* For N-point FFT */

for ( i = 0 ; i < num_stages ; i++ )
{
index = 0;

for ( j = 0; j < leg_diff ; j++ )
{

for ( upper_leg = j; upper_leg < N ; upper_leg += (2*leg_diff) )
{
lower_leg = upper_leg + leg_diff;
temp1.real=(Y[upper_leg]).real + (Y[lower_leg]).real;
temp1.imag=(Y[upper_leg]).imag + (Y[lower_leg]).imag;
temp2.real=(Y[upper_leg]).real - (Y[lower_leg]).real;
temp2.imag=(Y[upper_leg]).imag - (Y[lower_leg]).imag;

(Y[lower_leg]).real = ((long)temp2.real * (w[index]).real)/8192;
(Y[lower_leg]).real -= ((long)temp2.imag * (w[index]).imag)/8192;

(Y[lower_leg]).imag = ((long)temp2.real * (w[index]).imag)/8192;
(Y[lower_leg]).imag += ((long)temp2.imag * (w[index]).real)/8192;

(Y[upper_leg]).real = temp1.real;

(Y[upper_leg]).imag = temp1.imag;
}
index+=step;
}
leg_diff = leg_diff / 2;
step *= 2;
}

/* bit reversal for resequencing data */
j=0;
for ( i=1 ; i < (N-1) ; i++ )
{
k = N / 2;

while ( k <= j)
{
j = j - k;
k >>= 1;
}

j = j + k;

if ( i < j )
{
temp1.real = (Y[j]).real;
temp1.imag = (Y[j]).imag;
(Y[j]).real = (Y[i]).real;
(Y[j]).imag = (Y[i]).imag;
(Y[i]).real = temp1.real;
(Y[i]).imag = temp1.imag;
}
}
return;
}

⑵ 高分 xilinx fft ipcore 7.1 怎么使用新手

多看看DATASHEET就知道了，
祝好运

⑶ FFT的定点DSP怎样实现的

看你用什么公司的芯片了，一般厂家都会提供定点fft库，最好用厂家提供的，因为这种库都是经过优化的算法，速度快，一般是汇编实现的。实在没有库可以用这个：

/* fix_fft.c - Fixed-point in-place Fast Fourier Transform */
/*
All data are fixed-point short integers, in which -32768
to +32768 represent -1.0 to +1.0 respectively. Integer
arithmetic is used for speed, instead of the more natural
floating-point.

For the forward FFT (time -> freq), fixed scaling is
performed to prevent arithmetic overflow, and to map a 0dB
sine/cosine wave (i.e. amplitude = 32767) to two -6dB freq
coefficients. The return value is always 0.

For the inverse FFT (freq -> time), fixed scaling cannot be
done, as two 0dB coefficients would sum to a peak amplitude
of 64K, overflowing the 32k range of the fixed-point integers.
Thus, the fix_fft() routine performs variable scaling, and
returns a value which is the number of bits LEFT by which
the output must be shifted to get the actual amplitude
(i.e. if fix_fft() returns 3, each value of fr[] and fi[]
must be multiplied by 8 (2**3) for proper scaling.
Clearly, this cannot be done within fixed-point short
integers. In practice, if the result is to be used as a
filter, the scale_shift can usually be ignored, as the
result will be approximately correctly normalized as is.

Written by: Tom Roberts 11/8/89
Made portable: Malcolm Slaney 12/15/94 [email protected]
Enhanced: Dimitrios P. Bouras 14 Jun 2006 [email protected]
*/

#define N_WAVE 1024 /* full length of Sinewave[] */
#define LOG2_N_WAVE 10 /* log2(N_WAVE) */

/*
Henceforth "short" implies 16-bit word. If this is not
the case in your architecture, please replace "short"
with a type definition which *is* a 16-bit word.
*/

/*
Since we only use 3/4 of N_WAVE, we define only
this many samples, in order to conserve data space.
*/
short Sinewave[N_WAVE-N_WAVE/4] = {
0, 201, 402, 603, 804, 1005, 1206, 1406,
1607, 1808, 2009, 2209, 2410, 2610, 2811, 3011,
3211, 3411, 3611, 3811, 4011, 4210, 4409, 4608,
4807, 5006, 5205, 5403, 5601, 5799, 5997, 6195,
6392, 6589, 6786, 6982, 7179, 7375, 7571, 7766,
7961, 8156, 8351, 8545, 8739, 8932, 9126, 9319,
9511, 9703, 9895, 10087, 10278, 10469, 10659, 10849,
11038, 11227, 11416, 11604, 11792, 11980, 12166, 12353,
12539, 12724, 12909, 13094, 13278, 13462, 13645, 13827,
14009, 14191, 14372, 14552, 14732, 14911, 15090, 15268,
15446, 15623, 15799, 15975, 16150, 16325, 16499, 16672,
16845, 17017, 17189, 17360, 17530, 17699, 17868, 18036,
18204, 18371, 18537, 18702, 18867, 19031, 19194, 19357,
19519, 19680, 19840, 20000, 20159, 20317, 20474, 20631,
20787, 20942, 21096, 21249, 21402, 21554, 21705, 21855,
22004, 22153, 22301, 22448, 22594, 22739, 22883, 23027,
23169, 23311, 23452, 23592, 23731, 23869, 24006, 24143,
24278, 24413, 24546, 24679, 24811, 24942, 25072, 25201,
25329, 25456, 25582, 25707, 25831, 25954, 26077, 26198,
26318, 26437, 26556, 26673, 26789, 26905, 27019, 27132,
27244, 27355, 27466, 27575, 27683, 27790, 27896, 28001,
28105, 28208, 28309, 28410, 28510, 28608, 28706, 28802,
28897, 28992, 29085, 29177, 29268, 29358, 29446, 29534,
29621, 29706, 29790, 29873, 29955, 30036, 30116, 30195,
30272, 30349, 30424, 30498, 30571, 30643, 30713, 30783,
30851, 30918, 30984, 31049, 31113, 31175, 31236, 31297,
31356, 31413, 31470, 31525, 31580, 31633, 31684, 31735,
31785, 31833, 31880, 31926, 31970, 32014, 32056, 32097,
32137, 32176, 32213, 32249, 32284, 32318, 32350, 32382,
32412, 32441, 32468, 32495, 32520, 32544, 32567, 32588,
32609, 32628, 32646, 32662, 32678, 32692, 32705, 32717,
32727, 32736, 32744, 32751, 32757, 32761, 32764, 32766,
32767, 32766, 32764, 32761, 32757, 32751, 32744, 32736,
32727, 32717, 32705, 32692, 32678, 32662, 32646, 32628,
32609, 32588, 32567, 32544, 32520, 32495, 32468, 32441,
32412, 32382, 32350, 32318, 32284, 32249, 32213, 32176,
32137, 32097, 32056, 32014, 31970, 31926, 31880, 31833,
31785, 31735, 31684, 31633, 31580, 31525, 31470, 31413,
31356, 31297, 31236, 31175, 31113, 31049, 30984, 30918,
30851, 30783, 30713, 30643, 30571, 30498, 30424, 30349,
30272, 30195, 30116, 30036, 29955, 29873, 29790, 29706,
29621, 29534, 29446, 29358, 29268, 29177, 29085, 28992,
28897, 28802, 28706, 28608, 28510, 28410, 28309, 28208,
28105, 28001, 27896, 27790, 27683, 27575, 27466, 27355,
27244, 27132, 27019, 26905, 26789, 26673, 26556, 26437,
26318, 26198, 26077, 25954, 25831, 25707, 25582, 25456,
25329, 25201, 25072, 24942, 24811, 24679, 24546, 24413,
24278, 24143, 24006, 23869, 23731, 23592, 23452, 23311,
23169, 23027, 22883, 22739, 22594, 22448, 22301, 22153,
22004, 21855, 21705, 21554, 21402, 21249, 21096, 20942,
20787, 20631, 20474, 20317, 20159, 20000, 19840, 19680,
19519, 19357, 19194, 19031, 18867, 18702, 18537, 18371,
18204, 18036, 17868, 17699, 17530, 17360, 17189, 17017,
16845, 16672, 16499, 16325, 16150, 15975, 15799, 15623,
15446, 15268, 15090, 14911, 14732, 14552, 14372, 14191,
14009, 13827, 13645, 13462, 13278, 13094, 12909, 12724,
12539, 12353, 12166, 11980, 11792, 11604, 11416, 11227,
11038, 10849, 10659, 10469, 10278, 10087, 9895, 9703,
9511, 9319, 9126, 8932, 8739, 8545, 8351, 8156,
7961, 7766, 7571, 7375, 7179, 6982, 6786, 6589,
6392, 6195, 5997, 5799, 5601, 5403, 5205, 5006,
4807, 4608, 4409, 4210, 4011, 3811, 3611, 3411,
3211, 3011, 2811, 2610, 2410, 2209, 2009, 1808,
1607, 1406, 1206, 1005, 804, 603, 402, 201,
0, -201, -402, -603, -804, -1005, -1206, -1406,
-1607, -1808, -2009, -2209, -2410, -2610, -2811, -3011,
-3211, -3411, -3611, -3811, -4011, -4210, -4409, -4608,
-4807, -5006, -5205, -5403, -5601, -5799, -5997, -6195,
-6392, -6589, -6786, -6982, -7179, -7375, -7571, -7766,
-7961, -8156, -8351, -8545, -8739, -8932, -9126, -9319,
-9511, -9703, -9895, -10087, -10278, -10469, -10659, -10849,
-11038, -11227, -11416, -11604, -11792, -11980, -12166, -12353,
-12539, -12724, -12909, -13094, -13278, -13462, -13645, -13827,
-14009, -14191, -14372, -14552, -14732, -14911, -15090, -15268,
-15446, -15623, -15799, -15975, -16150, -16325, -16499, -16672,
-16845, -17017, -17189, -17360, -17530, -17699, -17868, -18036,
-18204, -18371, -18537, -18702, -18867, -19031, -19194, -19357,
-19519, -19680, -19840, -20000, -20159, -20317, -20474, -20631,
-20787, -20942, -21096, -21249, -21402, -21554, -21705, -21855,
-22004, -22153, -22301, -22448, -22594, -22739, -22883, -23027,
-23169, -23311, -23452, -23592, -23731, -23869, -24006, -24143,
-24278, -24413, -24546, -24679, -24811, -24942, -25072, -25201,
-25329, -25456, -25582, -25707, -25831, -25954, -26077, -26198,
-26318, -26437, -26556, -26673, -26789, -26905, -27019, -27132,
-27244, -27355, -27466, -27575, -27683, -27790, -27896, -28001,
-28105, -28208, -28309, -28410, -28510, -28608, -28706, -28802,
-28897, -28992, -29085, -29177, -29268, -29358, -29446, -29534,
-29621, -29706, -29790, -29873, -29955, -30036, -30116, -30195,
-30272, -30349, -30424, -30498, -30571, -30643, -30713, -30783,
-30851, -30918, -30984, -31049, -31113, -31175, -31236, -31297,
-31356, -31413, -31470, -31525, -31580, -31633, -31684, -31735,
-31785, -31833, -31880, -31926, -31970, -32014, -32056, -32097,
-32137, -32176, -32213, -32249, -32284, -32318, -32350, -32382,
-32412, -32441, -32468, -32495, -32520, -32544, -32567, -32588,
-32609, -32628, -32646, -32662, -32678, -32692, -32705, -32717,
-32727, -32736, -32744, -32751, -32757, -32761, -32764, -32766,
};

/*
FIX_MPY() - fixed-point multiplication & scaling.
Substitute inline assembly for hardware-specific
optimization suited to a particluar DSP processor.
Scaling ensures that result remains 16-bit.
*/
inline short FIX_MPY(short a, short b)
{
/* shift right one less bit (i.e. 15-1) */
int c = ((int)a * (int)b) >> 14;
/* last bit shifted out = rounding-bit */
b = c & 0x01;
/* last shift + rounding bit */
a = (c >> 1) + b;
return a;
}

/*
fix_fft() - perform forward/inverse fast Fourier transform.
fr[n],fi[n] are real and imaginary arrays, both INPUT AND
RESULT (in-place FFT), with 0 <= n < 2**m; set inverse to
0 for forward transform (FFT), or 1 for iFFT.
*/
int fix_fft(short fr[], short fi[], short m, short inverse)
{
int mr, nn, i, j, l, k, istep, n, scale, shift;
short qr, qi, tr, ti, wr, wi;

n = 1 << m;

/* max FFT size = N_WAVE */
if (n > N_WAVE)
return -1;

mr = 0;
nn = n - 1;
scale = 0;

/* decimation in time - re-order data */
for (m=1; m<=nn; ++m) {
l = n;
do {
l >>= 1;
} while (mr+l > nn);
mr = (mr & (l-1)) + l;

if (mr <= m)
continue;
tr = fr[m];
fr[m] = fr[mr];
fr[mr] = tr;
ti = fi[m];
fi[m] = fi[mr];
fi[mr] = ti;
}

l = 1;
k = LOG2_N_WAVE-1;
while (l < n) {
if (inverse) {
/* variable scaling, depending upon data */
shift = 0;
for (i=0; i<n; ++i) {
j = fr[i];
if (j < 0)
j = -j;
m = fi[i];
if (m < 0)
m = -m;
if (j > 16383 || m > 16383) {
shift = 1;
break;
}
}
if (shift)
++scale;
} else {
/*
fixed scaling, for proper normalization --
there will be log2(n) passes, so this results
in an overall factor of 1/n, distributed to
maximize arithmetic accuracy.
*/
shift = 1;
}
/*
it may not be obvious, but the shift will be
performed on each data point exactly once,
ring this pass.
*/
istep = l << 1;
for (m=0; m<l; ++m) {
j = m << k;
/* 0 <= j < N_WAVE/2 */
wr = Sinewave[j+N_WAVE/4];
wi = -Sinewave[j];
if (inverse)
wi = -wi;
if (shift) {
wr >>= 1;
wi >>= 1;
}
for (i=m; i<n; i+=istep) {
j = i + l;
tr = FIX_MPY(wr,fr[j]) - FIX_MPY(wi,fi[j]);
ti = FIX_MPY(wr,fi[j]) + FIX_MPY(wi,fr[j]);
qr = fr[i];
qi = fi[i];
if (shift) {
qr >>= 1;
qi >>= 1;
}
fr[j] = qr - tr;
fi[j] = qi - ti;
fr[i] = qr + tr;
fi[i] = qi + ti;
}
}
--k;
l = istep;
}
return scale;
}

/*
fix_fftr() - forward/inverse FFT on array of real numbers.
Real FFT/iFFT using half-size complex FFT by distributing
even/odd samples into real/imaginary arrays respectively.
In order to save data space (i.e. to avoid two arrays, one
for real, one for imaginary samples), we proceed in the
following two steps: a) samples are rearranged in the real
array so that all even samples are in places 0-(N/2-1) and
all imaginary samples in places (N/2)-(N-1), and b) fix_fft
is called with fr and fi pointing to index 0 and index N/2
respectively in the original array. The above guarantees
that fix_fft "sees" consecutive real samples as alternating
real and imaginary samples in the complex array.
*/
int fix_fftr(short f[], int m, int inverse)
{
int i, N = 1<<(m-1), scale = 0;
short tt, *fr=f, *fi=&f[N];

if (inverse)
scale = fix_fft(fi, fr, m-1, inverse);
for (i=1; i<N; i+=2) {
tt = f[N+i-1];
f[N+i-1] = f[i];
f[i] = tt;
}
if (! inverse)
scale = fix_fft(fi, fr, m-1, inverse);
return scale;
}

//-------------------------------------------------------------//
//-------------------------------------------------------------//
//-------------------------------------------------------------//
//-------------------------------------------------------------//
//-------------------------------------------------------------//
//-------------------------------------------------------------//

/*
Short test program to accompany fix_fft.c
*/

#define DEBUG 0

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#include "fix_fft.c"

#define N 128
#define log2N 7
#define AMPLITUDE 12288

#define MAX(a,b) (((a)>(b))?(a):(b))

int main()
{
int i, scale;
unsigned diff;
short x[N], fx[N], fy[N];

for (i=0; i<N; i++) {
fx[i] = fy[i] = x[i] = AMPLITUDE*cos(i*2*3.1415926535/N);
#if DEBUG
printf("%d %d\n", i, x[i]);
#endif
}

fix_fftr(fy, log2N, 0);
scale = fix_fftr(fy, log2N, 1);
#if DEBUG
puts("");
fprintf(stderr, "scale = %d\n", scale);
#endif

for (i=0,diff=0; i<N; i++) {
#if DEBUG
int sample;
sample = fy[i] << scale;
printf("%d %d\n", i, sample);
#endif
diff = MAX(abs(fx[i]-(fy[i] << scale)), diff);
}
fprintf(stderr, "max amplitude error = %g %%\n",
diff *100./AMPLITUDE);
return 0;
}

⑷ C语言 1024点快速傅里叶变换（FFT）程序，最好经过优化，执行速度快

void fft()
{
int nn,n1,n2,i,j,k,l,m,s,l1;
float ar[1024],ai[1024]; // 实部虚部
float a[2050];

float t1,t2,x,y;
float w1,w2,u1,u2,z;
float fsin[10]={0.000000,1.000000,0.707107,0.3826834,0.1950903,0.09801713,0.04906767,0.02454123,0.01227154,0.00613588,};// 优化
float fcos[10]={-1.000000,0.000000,0.7071068,0.9238796,0.9807853,0.99518472,0.99879545,0.9996988,0.9999247,0.9999812,};
nn=1024;
s=10;

n1=nn/2; n2=nn-1;
j=1;
for(i=1;i<=nn;i++)
{
a[2*i]=ar[i-1];
a[2*i+1]=ai[i-1];
}
for(l=1;l<n2;l++)
{
if(l<j)
{
t1=a[2*j];
t2=a[2*j+1];
a[2*j]=a[2*l];
a[2*j+1]=a[2*l+1];
a[2*l]=t1;
a[2*l+1]=t2;
}
k=n1;
while (k<j)
{
j=j-k;
k=k/2;
}
j=j+k;
}
for(i=1;i<=s;i++)
{
u1=1;
u2=0;
m=(1<<i);
k=m>>1;
w1=fcos[i-1];
w2=-fsin[i-1];
for(j=1;j<=k;j++)
{
for(l=j;l<nn;l=l+m)
{
l1=l+k;
t1=a[2*l1]*u1-a[2*l1+1]*u2;
t2=a[2*l1]*u2+a[2*l1+1]*u1;
a[2*l1]=a[2*l]-t1;
a[2*l1+1]=a[2*l+1]-t2;
a[2*l]=a[2*l]+t1;
a[2*l+1]=a[2*l+1]+t2;
}
z=u1*w1-u2*w2;
u2=u1*w2+u2*w1;
u1=z;
}
}
for(i=1;i<=nn/2;i++)
{
ar[i]=a[2*i+2]/nn;
ai[i]=-a[2*i+3]/nn;
a[i]=4*sqrt(ar[i]*ar[i]+ai[i]*ai[i]); // 幅值
}
}

⑸ DSP上实现FFT、IFFT

CCS应该带这两个库函数，可以直接调用的，而且是TI汇编优化过的，试着找找

⑹ 基于FFT的算法优化要C语言完整程序（利用旋转因子的性质），有的请留言，答谢！！！(有核心代码，望指教

实现(C描述)

#include <stdio.h>

#include <math.h>

#include <stdlib.h>

//#include "complex.h"

// --------------------------------------------------------------------------

#define N 8 //64

#define M 3 //6 //2^m=N

#define PI 3.1415926

// --------------------------------------------------------------------------

float twiddle[N/2] = {1.0, 0.707, 0.0, -0.707};

float x_r[N] = {1, 1, 1, 1, 0, 0, 0, 0};

float x_i[N]; //N=8

/*

float twiddle[N/2] = {1, 0.9951, 0.9808, 0.9570, 0.9239, 0.8820, 0.8317, 0.7733,

0.7075, 0.6349, 0.5561, 0.4721, 0.3835, 0.2912, 0.1961, 0.0991,

0.0000,-0.0991,-0.1961,-0.2912,-0.3835,-0.4721,-0.5561,-0.6349,

-0.7075,-0.7733, 0.8317,-0.8820,-0.9239,-0.9570,-0.9808,-0.9951}; //N=64

float x_r[N]={1,1,1,1,1,1,1,1,

1,1,1,1,1,1,1,1,

1,1,1,1,1,1,1,1,

1,1,1,1,1,1,1,1,

0,0,0,0,0,0,0,0,

0,0,0,0,0,0,0,0,

0,0,0,0,0,0,0,0,

0,0,0,0,0,0,0,0,};

float x_i[N];

*/

FILE *fp;

// ----------------------------------- func -----------------------------------

/**

* 初始化输出虚部

*/

static void fft_init( void )

{

int i;

for(i=0; i<N; i++) x_i[i] = 0.0;

}

/**

* 反转算法.将时域信号重新排序.

* 这个算法有改进的空间

*/

static void bitrev( void )

{

int p=1, q, i;

int bit_rev[ N ]; //

float xx_r[ N ]; //

bit_rev[ 0 ] = 0;

while( p < N )

{

for(q=0; q<p; q++)

{

bit_rev[ q ] = bit_rev[ q ] * 2;

bit_rev[ q + p ] = bit_rev[ q ] + 1;

}

p *= 2;

}

for(i=0; i<N; i++) xx_r[ i ] = x_r[ i ];

for(i=0; i<N; i++) x_r[i] = xx_r[ bit_rev[i] ];

}

/* ------------ add by sshc625 ------------ */

static void bitrev2( void )

{

return ;

}

/* */

void display( void )

{

printf("\n\n");

int i;

for(i=0; i<N; i++)

printf("%f\t%f\n", x_r[i], x_i[i]);

}

/**

*

*/

void fft1( void )

{ fp = fopen("log1.txt", "a+");

int L, i, b, j, p, k, tx1, tx2;

float TR, TI, temp; // 临时变量

float tw1, tw2;

/* 深M. 对层进行循环. L为当前层, 总层数为M. */

for(L=1; L<=M; L++)

{

fprintf(fp,"----------Layer=%d----------\n", L);

/* b的意义非常重大,b表示当前层的颗粒具有的输入样本点数 */

b = 1;

i = L - 1;

while(i > 0)

{

b *= 2;

i--;

}

// -------------- 是否外层对颗粒循环, 内层对样本点循环逻辑性更强一些呢! --------------

/*

* outter对参与DFT的样本点进行循环

* L=1, 循环了1次(4个颗粒, 每个颗粒2个样本点)

* L=2, 循环了2次(2个颗粒, 每个颗粒4个样本点)

* L=3, 循环了4次(1个颗粒, 每个颗粒8个样本点)

*/

for(j=0; j<b; j++)

{

/* 求旋转因子tw1 */

p = 1;

i = M - L; // M是为总层数, L为当前层.

while(i > 0)

{

p = p*2;

i--;

}

p = p * j;

tx1 = p % N;

tx2 = tx1 + 3*N/4;

tx2 = tx2 % N;

// tw1是cos部分, 实部; tw2是sin部分, 虚数部分.

tw1 = ( tx1>=N/2)? -twiddle[tx1-N/2] : twiddle[ tx1 ];

tw2 = ( tx2>=N/2)? -twiddle[tx2-(N/2)] : twiddle[tx2];

/*

* inner对颗粒进行循环

* L=1, 循环了4次(4个颗粒, 每个颗粒2个输入)

* L=2, 循环了2次(2个颗粒, 每个颗粒4个输入)

* L=3, 循环了1次(1个颗粒, 每个颗粒8个输入)

*/

for(k=j; k<N; k=k+2*b)

{

TR = x_r[k]; // TR就是A, x_r[k+b]就是B.

TI = x_i[k];

temp = x_r[k+b];

/*

* 如果复习一下 (a+j*b)(c+j*d)两个复数相乘后的实部虚部分别是什么

* 就能理解为什么会如下运算了, 只有在L=1时候输入才是实数, 之后层的

* 输入都是复数, 为了让所有的层的输入都是复数, 我们只好让L=1时候的

* 输入虚部为0

* x_i[k+b]*tw2是两个虚数相乘

*/

fprintf(fp, "tw1=%f, tw2=%f\n", tw1, tw2);

x_r[k] = TR + x_r[k+b]*tw1 + x_i[k+b]*tw2;

x_i[k] = TI - x_r[k+b]*tw2 + x_i[k+b]*tw1;

x_r[k+b] = TR - x_r[k+b]*tw1 - x_i[k+b]*tw2;

x_i[k+b] = TI + temp*tw2 - x_i[k+b]*tw1;

fprintf(fp, "k=%d, x_r[k]=%f, x_i[k]=%f\n", k, x_r[k], x_i[k]);

fprintf(fp, "k=%d, x_r[k]=%f, x_i[k]=%f\n", k+b, x_r[k+b], x_i[k+b]);

} //

} //

} //

}

/**

* ------------ add by sshc625 ------------

* 该实现的流程为

* for( Layer )

* for( Granule )

* for( Sample )

*

*

*

*

*/

void fft2( void )

{ fp = fopen("log2.txt", "a+");

int cur_layer, gr_num, i, k, p;

float tmp_real, tmp_imag, temp; // 临时变量, 记录实部

float tw1, tw2;// 旋转因子,tw1为旋转因子的实部cos部分, tw2为旋转因子的虚部sin部分.

int step; // 步进

int sample_num; // 颗粒的样本总数(各层不同, 因为各层颗粒的输入不同)

/* 对层循环 */

for(cur_layer=1; cur_layer<=M; cur_layer++)

{

/* 求当前层拥有多少个颗粒(gr_num) */

gr_num = 1;

i = M - cur_layer;

while(i > 0)

{

i--;

gr_num *= 2;

}

/* 每个颗粒的输入样本数N' */

sample_num = (int)pow(2, cur_layer);

/* 步进. 步进是N'/2 */

step = sample_num/2;

/* */

k = 0;

/* 对颗粒进行循环 */

for(i=0; i<gr_num; i++)

{

/*

* 对样本点进行循环, 注意上限和步进

*/

for(p=0; p<sample_num/2; p++)

{

// 旋转因子, 需要优化...

tw1 = cos(2*PI*p/pow(2, cur_layer));

tw2 = -sin(2*PI*p/pow(2, cur_layer));

tmp_real = x_r[k+p];

tmp_imag = x_i[k+p];

temp = x_r[k+p+step];

/*(tw1+jtw2)(x_r[k]+jx_i[k])

*

* real : tw1*x_r[k] - tw2*x_i[k]

* imag : tw1*x_i[k] + tw2*x_r[k]

* 我想不抽象出一个

* typedef struct {

* double real; // 实部

* double imag; // 虚部

* } complex; 以及针对complex的操作

* 来简化复数运算是否是因为效率上的考虑!

*/

/* 蝶形算法 */

x_r[k+p] = tmp_real + ( tw1*x_r[k+p+step] - tw2*x_i[k+p+step] );

x_i[k+p] = tmp_imag + ( tw2*x_r[k+p+step] + tw1*x_i[k+p+step] );

/* X[k] = A(k)+WB(k)

* X[k+N/2] = A(k)-WB(k) 的性质可以优化这里*/

// 旋转因子, 需要优化...

tw1 = cos(2*PI*(p+step)/pow(2, cur_layer));

tw2 = -sin(2*PI*(p+step)/pow(2, cur_layer));

x_r[k+p+step] = tmp_real + ( tw1*temp - tw2*x_i[k+p+step] );

x_i[k+p+step] = tmp_imag + ( tw2*temp + tw1*x_i[k+p+step] );

printf("k=%d, x_r[k]=%f, x_i[k]=%f\n", k+p, x_r[k+p], x_i[k+p]);

printf("k=%d, x_r[k]=%f, x_i[k]=%f\n", k+p+step, x_r[k+p+step], x_i[k+p+step]);

}

/* 开跳!:) */

k += 2*step;

}

}

}

/*

* 后记:

* 究竟是颗粒在外层循环还是样本输入在外层, 好象也差不多, 复杂度完全一样.

* 但以我资质愚钝花费了不少时间才弄明白这数十行代码.

* 从中我发现一个于我非常有帮助的教训, 很久以前我写过一部分算法, 其中绝大多数都是递归.

* 将数据量减少, 减少再减少, 用归纳的方式来找出数据量加大代码的规律

* 比如FFT

* 1. 先写死LayerI的代码; 然后再把LayerI的输出作为LayerII的输入, 又写死代码; ......

* 大约3层就可以统计出规律来. 这和递归也是一样, 先写死一两层, 自然就出来了!

* 2. 有的功能可以写伪代码, 不急于求出结果, 降低复杂性, 把逻辑结果定出来后再添加.

* 比如旋转因子就可以写死, 就写1.0. 流程出来后再写旋转因子.

* 寥寥数语, 我可真是流了不少汗! Happy!

*/

void dft( void )

{

int i, n, k, tx1, tx2;

float tw1,tw2;

float xx_r[N],xx_i[N];

/*

* clear any data in Real and Imaginary result arrays prior to DFT

*/

for(k=0; k<=N-1; k++)

xx_r[k] = xx_i[k] = x_i[k] = 0.0;

// caculate the DFT

for(k=0; k<=(N-1); k++)

{

for(n=0; n<=(N-1); n++)

{

tx1 = (n*k);

tx2 = tx1+(3*N)/4;

tx1 = tx1%(N);

tx2 = tx2%(N);

if(tx1 >= (N/2))

tw1 = -twiddle[tx1-(N/2)];

else

tw1 = twiddle[tx1];

if(tx2 >= (N/2))

tw2 = -twiddle[tx2-(N/2)];

else

tw2 = twiddle[tx2];

xx_r[k] = xx_r[k]+x_r[n]*tw1;

xx_i[k] = xx_i[k]+x_r[n]*tw2;

}

xx_i[k] = -xx_i[k];

}

// display

for(i=0; i<N; i++)

printf("%f\t%f\n", xx_r[i], xx_i[i]);

}

// ---------------------------------------------------------------------------

int main( void )

{

fft_init( );

bitrev( );

// bitrev2( );

//fft1( );

fft2( );

display( );

system( "pause" );

// dft();

return 1;

}

本文来自CSDN博客，转载请标明出处：http://blog.csdn.net/sshcx/archive/2007/06/14/1651616.aspx

⑺ 急求经过FFT优化的高精度乘法、位运算优化的快速幂，求问题2的N次方

水题！这题是我出的！
http://dev.luogu.org/problem/show?pid=2010

⑻ 举例说明dsp指令系统针对数字信号处理有哪些优化

dsp能计算fft，还有很多其他的功能

⑼ DFT DTFT FFT有啥区别

对于一般的周期信号可以用一系列（有限个或者无穷多了）正弦波的叠加来表示。这些正弦波的频率都是某一个特定频率的倍数如5hz、2*5hz、3*5hz……（其中的5hz叫基频）。这是傅立叶级数的思想。所以说周期信号的频率是离散的。
而且，对于周期信号有一个特点，信号的周期越长，信号的基频越小。
非周期信号可以看作周期无穷大的周期信号，那么它的基频就是无穷小，这样它的频率组成就编程了连续的了。求这个连续频率的谱线的过程就是傅立叶变换。包括这样几种：
DTFT（时间离散，频率连续）
DFT（时间和频率都离散，可在计算机中处理）
FFT（DFT的优化算法，计算量减少）

阅读全文

fft优化

与fft优化相关的阅读推荐