
ydfgic

Post #1, posted 2011-07-29 11:39

Last edited by ydfgic on 2011-08-01 13:58

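Summary:
I made a final revision of the test code so that it is easy to set the number of threads, choose which lock to test, and set the delay parameter.
Testing shows that the timings for my implementation are very stable: the difference between 2, 10, 20 and even 40 threads is small, so the result is almost independent of the thread count. The mutex is likewise largely insensitive to the thread count, but pthread_spinlock_t is sensitive to it and becomes much less efficient with many threads.
Increasing the lock granularity (the amount of work done while holding the lock) also affects the results. With very fine granularity my implementation is 4-5 times as fast as the mutex. With coarse granularity, for example with the delay loop count set to 1000, it is still a bit more than twice as fast as the mutex, but it keeps the CPU busier. The speed of the spinlock is not worth mentioning; it is very low.
Conclusions:
For coarse-grained locking, the mutex is definitely the first choice. Its performance is middling, but because a waiting thread sleeps instead of occupying the CPU, it has little impact on the rest of the system.
For fine-grained locking, the lock I implemented can be used. It is almost independent of the thread count and extremely efficient, possibly because the implementation is so simple: it hands over the CPU as efficiently as possible, keeps one thread running, and cuts out unnecessary steps.
pthread_spinlock_t is too limited. With many threads it causes a severe performance loss, and it is only suitable for fine-grained locking in the first place.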

Test data:
1) 20 threads, delay 0
Mine: time ./myspinlock_O3.out 20 0 0
real 0m0.323s
user 0m0.364s
sys    0m0.276s

mutex: time ./myspinlock_O3.out 20 1 0
real 0m1.634s
user 0m1.972s
sys    0m1.264s

spinlock: time ./myspinlock_O3.out 20 2 0
real 0m6.259s
user 0m12.477s
sys    0m0.004s

2) 20 threads, delay 100
Mine: time ./myspinlock_O3.out 20 0 100
real 0m2.965s
user 0m3.268s
sys    0m2.636s

mutex: time ./myspinlock_O3.out 20 1 100
real 0m6.493s
user 0m6.344s
sys    0m6.604s

spinlock: time ./myspinlock_O3.out 20 2 100
real 0m15.760s
user 0m31.378s
sys    0m0.004s

3) 10 threads, delay 0
Mine: time ./myspinlock_O3.out 10 0 0
real 0m0.318s
user 0m0.372s
sys    0m0.248s

mutex: time ./myspinlock_O3.out 10 1 0
real 0m1.511s
user 0m1.808s
sys    0m1.200s

spinlock: time ./myspinlock_O3.out 10 2 0
real 0m3.625s
user 0m7.224s
sys    0m0.004s

4) 2 threads, delay 0
Mine: time ./myspinlock_O3.out 2 0 0
real 0m0.323s
user 0m0.376s
sys    0m0.184s

mutex: time ./myspinlock_O3.out 2 1 0
real 0m1.453s
user 0m1.688s
sys    0m1.136s

spinlock: time ./myspinlock_O3.out 2 2 0
real 0m0.819s
user 0m1.624s
sys    0m0.004s
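
The "4-5 times" and "a bit more than twice" figures from the summary can be checked against the `real` times of the 20-thread runs. The following is only a small sketch of that arithmetic: the helper names `speedup` and `print_speedups` are made up for illustration, and the constants are copied from the timings listed in this post.

```c
#include <assert.h>
#include <stdio.h>

/* How many times faster the candidate is than the baseline,
 * given two wall-clock times in seconds. */
static inline double speedup(double baseline_s, double candidate_s)
{
    assert(candidate_s > 0.0);
    return baseline_s / candidate_s;
}

/* Prints the ratios for the 20-thread runs ("real" times from the post). */
static inline void print_speedups(void)
{
    printf("delay 0:   mutex / mine    = %.2f\n", speedup(1.634, 0.323));
    printf("delay 0:   spinlock / mine = %.2f\n", speedup(6.259, 0.323));
    printf("delay 100: mutex / mine    = %.2f\n", speedup(6.493, 2.965));
}
```

With these inputs the mutex-to-mine ratio is a little over 5 at delay 0 and about 2.2 at delay 100, which matches the claims in the summary.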

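Final version of the implementation (myspinlock.h)

#ifndef MY_SPINLOCK_H
#define MY_SPINLOCK_H
#include <stdint.h>
#include <sched.h>   /* sched_yield() */
/* The test code declares its lock as my_spinlock_t. */
typedef volatile uint32_t my_spinlock_t;
#define MY_SPINLOCK_INITIALIZER 0
/* Atomically flip the lock word 0 -> 1; give up the CPU whenever that fails. */
#define spinlock_lock(lock) do { \
    while (!__sync_bool_compare_and_swap(lock, 0, 1)) \
        sched_yield(); \
} while (0)
/* Store 0 with release semantics so the critical section cannot move past the unlock. */
#define spinlock_unlock(lock) do { \
    __sync_lock_release(lock); \
} while (0)
#endif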
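
For comparison, roughly the same yield-on-contention lock can be written with C11 `<stdatomic.h>` instead of the GCC `__sync` builtins. This is only a sketch under assumptions, not the code that was benchmarked in this thread: the type `yield_lock_t` and the three function names are hypothetical, and the memory orderings (acquire on lock, release on unlock) are one reasonable choice.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_uint state; } yield_lock_t;
#define YIELD_LOCK_INIT { 0 }

/* One CAS attempt: returns true if this call took the lock. */
static inline bool yield_lock_try(yield_lock_t *l)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->state, &expected, 1u,
        memory_order_acquire, memory_order_relaxed);
}

/* Keep trying; hand the CPU to another runnable thread after each failure. */
static inline void yield_lock_acquire(yield_lock_t *l)
{
    while (!yield_lock_try(l))
        sched_yield();
}

/* Release the lock; the assert catches an unlock of a lock that is not held. */
static inline void yield_lock_release(yield_lock_t *l)
{
    assert(atomic_load_explicit(&l->state, memory_order_relaxed) == 1u);
    atomic_store_explicit(&l->state, 0u, memory_order_release);
}
```

A caller would declare `yield_lock_t l = YIELD_LOCK_INIT;` and wrap the critical section in `yield_lock_acquire(&l)` and `yield_lock_release(&l)`. Whether this performs the same as the macro version above has not been measured here.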


Final version of the test code

// gcc -Wall -g -O3 -o myspinlock.out myspinlock.c -lpthread
#include "myspinlock.h"
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

///////////////////////// test
#define TOTAL (1000000 * 20)   // total number of increments across all threads

my_spinlock_t lock = MY_SPINLOCK_INITIALIZER;
volatile int cnt = 0;
int NR;                // iterations per thread
int DELAY_CNT = 100;   // extra work done while holding the lock

// Scratch word for the delay loop (the final header does not define it).
static uint32_t bar = 13;
static uint32_t *foo = &bar;

// Worker using my lock.
void * fun1(void * arg)
{
    int i = 0, id = *(int*)arg;
    printf("thread:%d\n",id);
    for(; i < NR; i++)
    {
        spinlock_lock(&lock);
        cnt++;
        int j = 0;
        for (; j < DELAY_CNT; j++) {
            *foo = (*foo * 33) + 17;
        }
        spinlock_unlock(&lock);
    }
    printf("thread:%d over, lock:%u\n",id, (unsigned)lock);
    return 0;
}

// Worker using pthread_mutex_t.
pthread_mutex_t mlock = PTHREAD_MUTEX_INITIALIZER;
void * fun2(void * arg)
{
    int i = 0, id = *(int*)arg;
    printf("thread:%d\n",id);
    for(; i < NR; i++)
    {
        pthread_mutex_lock(&mlock);
        cnt++;
        int j = 0;
        for (; j < DELAY_CNT; j++) {
            *foo = (*foo * 33) + 17;
        }
        pthread_mutex_unlock(&mlock);
    }
    printf("thread:%d over, lock:%u\n",id, (unsigned)lock);
    return 0;
}

// Worker using pthread_spinlock_t.
pthread_spinlock_t splock;
void * fun3(void * arg)
{
    int i = 0, id = *(int*)arg;
    printf("thread:%d\n",id);
    for(; i < NR; i++)
    {
        pthread_spin_lock(&splock);
        cnt++;
        int j = 0;
        for (; j < DELAY_CNT; j++) {
            *foo = (*foo * 33) + 17;
        }
        pthread_spin_unlock(&splock);
    }
    printf("thread:%d over, lock:%u\n",id, (unsigned)lock);
    return 0;
}

int N = 20;
int main(int c, char * s[])
{
    int which = 0;
    if(c > 1)
    {
        // number of threads
        N = atoi(s[1]);
        if(N > 20 || N <= 1) N = 10;
    }
    if(c > 2)
    {
        // which func? 0 = my lock, 1 = mutex, 2 = pthread spinlock
        which = atoi(s[2]);
        if(which > 2 || which < 0) which = 0;
    }
    if(c > 3)
    {
        // delay param
        DELAY_CNT = atoi(s[3]);
        if(DELAY_CNT > 10000 || DELAY_CNT < 0) DELAY_CNT = 100;
    }
    pthread_t id[N];
    int args[N];
    int i = 0;
    void * (*fun[])(void*) = { fun1, fun2, fun3 };
    pthread_spin_init(&splock, PTHREAD_PROCESS_PRIVATE);
    NR = TOTAL / N;
    for(;i<N;++i){
        args[i] = i;
        pthread_create(&id[i],NULL,fun[which],&args[i]);
    }
    for(i=0;i<N;++i){
        printf("join thread:%d\n", i);
        pthread_join(id[i],NULL);
        printf("join thread:%d done\n", i);
    }
    printf("cnt = %d, should be %d\n",cnt, N * NR);
    return 0;
}


===============================================
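The earlier updates below are kept for reference only.

Update
I reworked my implementation so that it gives up its time slice under contention. In testing it is roughly 2-3 times as efficient as the mutex:
real 0m0.431s
user 0m0.604s
sys 0m0.240s

I think this should be the final version I had in mind. At the very least, for simple locking it can replace the mutex from the pthread library.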

Code:

#ifndef MY_SPINLOCK_H
#define MY_SPINLOCK_H
#include <stdint.h>
#include <sched.h>   /* sched_yield() */
typedef volatile uint32_t my_spinlock_t;
#define MY_SPINLOCK_INITIALIZER 0
#define DELAY_NR 10000
static uint32_t bar = 13;
static uint32_t *foo = &bar;
/* Integer hash used as a cheap pseudo-random source while waiting. */
#define do_hash(a) do{ \
    (a) = ((a)+0x7ed55d16) + ((a)<<12); \
    (a) = ((a)^0xc761c23c) ^ ((a)>>19); \
    (a) = ((a)+0x165667b1) + ((a)<<5); \
    (a) = ((a)+0xd3a2646c) ^ ((a)<<9); \
    (a) = ((a)+0xfd7046c5) + ((a)<<3); \
    (a) = ((a)^0xb55a4f09) ^ ((a)>>16); \
}while(0)
/* Spin while the lock is held, yielding on about one in eleven hash values. */
#define my_spinlock_lock(lock) do{ \
    while(!__sync_bool_compare_and_swap(lock, 0, 1)) \
    { \
        while(*lock) \
        { \
            do_hash(*foo); \
            if((*foo % 11) == 1) \
                sched_yield(); \
        } \
    } \
}while(0)
#define my_spinlock_unlock(lock) do{ \
    *lock = 0; \
}while(0)
#endif


=======================================

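I have been studying atomic operations recently and implemented a spinlock based on some material I found online.
I benchmarked it against the POSIX mutex and spinlock, and the results surprised me.
The mutex did very well, my own implementation was slightly worse, and the POSIX pthread_spinlock_t was considerably worse.
I really did not expect the mutex to be this efficient. When I saw the result I could hardly believe my eyes.
It confirms once again: do not rely on intuition, real data is what counts.

Can anyone explain this? Thanks~

Environment:
uname -a
Linux bsd02 2.6.35.9 #1 SMP Tue Jan 11 02:09:50 EST 2011 x86_64 GNU/Linux
Dual-core Pentium(R) Dual-Core  CPU    E5400  @ 2.70GHz

Test with 20 concurrent threads, results:
My implementation:
real 0m1.659s
user 0m3.276s
sys    0m0.000s

mutex:
real 0m1.481s
user 0m1.164s
sys    0m1.764s

pthread spinlock:
real 0m6.171s
user 0m12.301s
sys    0m0.004s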


int-main

Post #2, posted 2011-07-29 11:43

Is the spinlock really that bad? Maybe it is not being used correctly?


ydfgic

Post #3, posted 2011-07-29 11:45

Last edited by ydfgic on 2011-08-01 10:45

This is the very first version of the code; the final code has been updated in post #1.
======
Code attached:

#ifndef MY_SPINLOCK_H
#define MY_SPINLOCK_H
#include<stdint.h>
typedef volatile uint32_t my_spinlock_t;
#define MY_SPINLOCK_INITIALIZER 0
#define DELAY_NR 10000
static uint32_t bar = 13;
static uint32_t *foo = &bar;
#define my_spinlock_lock(lock) do{ \
    int i; \
    while(!__sync_bool_compare_and_swap(lock, 0, 1)) \
    { \
        i = 0; \
        while(i++ < DELAY_NR) \
            *foo = (*foo * 33) + 17; \
    } \
}while(0)
#define my_spinlock_unlock(lock) do{ \
    *lock = 0; \
}while(0)
#endif


#include"myspinlock.h"
my_spinlock_t lock = MY_SPINLOCK_INITIALIZER;
volatile int cnt = 0;
#include<pthread.h>
#include<stdio.h>
// gcc -Wall -g -o myspinlock.out myspinlock.c -lpthread
#define NR 1000000
void * fun1(void * arg)
{
    int i = 0, id = *(int*)arg;
    printf("thread:%d\n",id);
    for(; i < NR; i++)
    {
        my_spinlock_lock(&lock);
        cnt++;
        my_spinlock_unlock(&lock);
    }
    printf("thread:%d over, lock:%d\n",id, lock);
    return 0;
}
pthread_mutex_t mlock = PTHREAD_MUTEX_INITIALIZER;
void * fun2(void * arg)
{
    int i = 0, id = *(int*)arg;
    printf("thread:%d\n",id);
    for(; i < NR; i++)
    {
        pthread_mutex_lock(&mlock);
        cnt++;
        pthread_mutex_unlock(&mlock);
    }
    printf("thread:%d over, lock:%d\n",id, lock);
    return 0;
}
pthread_spinlock_t splock;
void * fun3(void * arg)
{
    int i = 0, id = *(int*)arg;
    printf("thread:%d\n",id);
    for(; i < NR; i++)
    {
        pthread_spin_lock(&splock);
        cnt++;
        pthread_spin_unlock(&splock);
    }
    printf("thread:%d over, lock:%d\n",id, lock);
    return 0;
}
int N = 20;
int main(int c, char * s[])
{
    pthread_t id[N];
    int args[N];
    int i = 0;
    void * (*fun[])(void*) = { fun1,fun2,fun3};
    if(--c > 2) c = 0;
    printf("c:%d\n",c);
    pthread_spin_init(&splock,0);
    for(;i<N;++i){
        args[i] = i;
        pthread_create(&id[i],NULL,fun[c],&args[i]);
    }
    for(i=0;i<N;++i){
        printf("join thread:%d\n", i);
        pthread_join(id[i],NULL);
        printf("join thread:%d done\n", i);
    }
    printf("%d\n",cnt);
    return 0;
}



ydfgic

Post #4, posted 2011-07-29 11:47

Reply to #2 int-main
I have posted the code.
The spinlock really is that bad. It is a miracle, but it happened.


deadlylight

Post #5, posted 2011-07-29 11:58

What is __sync_bool_compare_and_swap? That is the most crucial piece.

Your example is one where every lock attempt succeeds, so the test results are not representative.


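群雄逐鹿中原

Post #6, posted 2011-07-29 13:05

Rendering "spin lock" in Chinese as 自旋鎖 ("self-rotating lock") is a poor translation; it should be 忙等鎖 ("busy-wait lock").
Busy-waiting is inevitably something for very extreme situations.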


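ydfgic

Post #7, posted 2011-07-29 13:12

What is __sync_bool_compare_and_swap? That is the most crucial piece.

Your example is one where every lock attempt succeeds, so the test results are not representative.
deadlylight, posted 2011-07-29 11:58

__sync_bool_compare_and_swap is a GCC built-in atomic function that performs a CAS (compare-and-swap) operation: it compares the current value with the expected one, and if they are equal it swaps in the new value and returns true; otherwise it returns false. So every thread that fails to take the lock ends up in the while loop and busy-waits.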
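
The CAS behaviour described above can be seen in a single thread. A minimal sketch, assuming GCC or Clang (the `__sync` builtins are compiler extensions); the helper names `try_lock_once` and `cas_walkthrough` are made up for illustration and do not appear in the thread's code.

```c
#include <assert.h>
#include <stdint.h>

/* One lock attempt. Atomically does:
 *   if (*lock == 0) { *lock = 1; return true; } else return false; */
static inline int try_lock_once(volatile uint32_t *lock)
{
    return __sync_bool_compare_and_swap(lock, 0, 1);
}

/* Walks through the three cases: free, already held, released. */
static inline void cas_walkthrough(void)
{
    volatile uint32_t lock = 0;

    assert(try_lock_once(&lock));    /* lock was free: CAS succeeds */
    assert(lock == 1);

    assert(!try_lock_once(&lock));   /* lock is held: CAS fails, value unchanged */
    assert(lock == 1);

    __sync_lock_release(&lock);      /* writes 0 with release semantics */
    assert(lock == 0);
    assert(try_lock_once(&lock));    /* free again: CAS succeeds */
}
```

In the lock macro, the failing case is the one that sends a thread into the wait loop: the CAS keeps returning false until the holder writes 0 back.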

Also, did you notice that the POSIX spinlock is even less efficient?