On copy-on-write (COW)

yuxh

#1, posted 2006-09-26 15:39

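Most STL implementations (GCC's libstdc++ in the examples below) use COW for string to keep it efficient. Several string objects share one data buffer, and copy construction, assignment and similar operations do not copy that buffer. Only when the buffer has to be modified and is already shared with another object is new space allocated and the buffer copied before the change. Likewise, a destructor does not free the buffer while other objects still share it.
For example: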

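#include <stdio.h>
#include <string>
using namespace std;
int main()
{
    string test1 = "hello";
    string test2(test1);    // copy construction: under COW no buffer is copied
    // %p expects a void pointer, hence the casts
    printf("test1:%p test2:%p\n", (const void*)test1.c_str(), (const void*)test2.c_str());
    return 0;
}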

Output:

test1:0x90a9014 test2:0x90a9014

The two addresses are equal, so both strings share the same buffer.
When does the data get copied? When the value of the string is about to be modified, of course:

#include <stdio.h>
#include <string>
using namespace std;
int main()
{
    string test1 = "hello";
    string test2(test1);
    printf("test1:%p test2:%p\n", (const void*)test1.c_str(), (const void*)test2.c_str());
    test2[0] = 'w';         // write through non-const operator[]: test2 gets its own buffer
    printf("test1:%p test2:%p\n", (const void*)test1.c_str(), (const void*)test2.c_str());
    return 0;
}

Output:

test1:0x9e85014 test2:0x9e85014
test1:0x9e85014 test2:0x9e8502c

As you can see, test2 now points to a different buffer.
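The behaviour in the two runs above can be sketched with a hand-written class. This is only a minimal illustration, not libstdc++'s actual code: CowString, Rep, detach and set are names made up here, and the class is single-threaded only. It is a separate class because current standard libraries generally no longer share std::string buffers (C++11's requirements rule COW out), so the address test on std::string itself may print different addresses today.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal COW string: copies share one buffer, writes detach.
class CowString {
    struct Rep {
        std::size_t refs;   // number of CowString objects using this buffer
        std::size_t len;
        char* data;
    };
    Rep* rep_;

    static Rep* make(const char* s, std::size_t len) {
        Rep* r = new Rep;
        r->refs = 1;
        r->len = len;
        r->data = new char[len + 1];
        std::memcpy(r->data, s, len);
        r->data[len] = '\0';
        return r;
    }
    void release() {
        if (--rep_->refs == 0) {    // the last owner frees the buffer
            delete[] rep_->data;
            delete rep_;
        }
    }
    void detach() {                 // called before any write
        if (rep_->refs > 1) {       // shared: take a private copy first
            Rep* own = make(rep_->data, rep_->len);
            --rep_->refs;
            rep_ = own;
        }
    }
public:
    CowString(const char* s) : rep_(make(s, std::strlen(s))) {}
    CowString(const CowString& o) : rep_(o.rep_) { ++rep_->refs; }
    CowString& operator=(const CowString& o) {
        ++o.rep_->refs;             // increment first: safe for self-assignment
        release();
        rep_ = o.rep_;
        return *this;
    }
    ~CowString() { release(); }

    const char* c_str() const { return rep_->data; }
    void set(std::size_t i, char c) { detach(); rep_->data[i] = c; }
};
```

With CowString a("hello"); CowString b(a); the two c_str() pointers compare equal until b.set(0, 'w') runs, after which b owns a separate buffer and a still reads "hello".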
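Going one step further: how does the implementation decide that the program is about to modify the buffer, so that it has to allocate new space?
Programs normally access and modify the data through operator[] and iterators. It would be natural to assume that use as an lvalue triggers the copy and use as an rvalue does not. In practice that is not what happens. Perhaps telling lvalue use from rvalue use is not that simple?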

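#include <stdio.h>
#include <string>
using namespace std;
int main()
{
    string test1 = "hello";
    string test2(test1);
    printf("test1:%p test2:%p\n", (const void*)test1.c_str(), (const void*)test2.c_str());
    // Nothing is written here, yet non-const operator[] is what gets called
    printf("test1:%p test2:%p\n", (const void*)&test1[0], (const void*)&test2[0]);
    return 0;
}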

Output:

test1:0x8a4a014 test2:0x8a4a014
test1:0x8a4a014 test2:0x8a4a02c

test2's address changed, even though nothing was written.
Let's look at the source:

const_reference
operator[] (size_type __pos) const
{
    _GLIBCXX_DEBUG_ASSERT(__pos <= size());
    return _M_data()[__pos];
}
reference
operator[](size_type __pos)
{
    _GLIBCXX_DEBUG_ASSERT(__pos < size());
    _M_leak();
    return _M_data()[__pos];
}

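In other words, whether a write might happen is decided by the type of the object: through a const string no copy is made, while through a non-const string a shared buffer is always copied, whether or not a write follows.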
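To make the overload choice visible without depending on any particular std::string implementation, here is a small sketch. Probe and read_first are names invented for this illustration; the type merely counts which operator[] overload the compiler picked.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical probe type: records which operator[] overload was selected.
struct Probe {
    char buf[6];
    mutable int const_calls;    // mutable so the const overload can count
    int nonconst_calls;

    Probe() : const_calls(0), nonconst_calls(0) { std::strcpy(buf, "hello"); }

    const char& operator[](int i) const { ++const_calls; return buf[i]; }
    char& operator[](int i) { ++nonconst_calls; return buf[i]; }
};

// The parameter is a const reference, so only the const overload is viable.
inline char read_first(const Probe& p) { return p[0]; }
```

Given a non-const Probe p, the expression char c = p[0]; only reads, yet it bumps nonconst_calls, because the overload is chosen from the constness of p and not from what is done with the result. read_first(p) bumps const_calls instead.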
Now look at this:

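#include <stdio.h>
#include <string>
using namespace std;
int main()
{
    string test1 = "hello";
    string test2(test1);
    printf("test1:%p test2:%p\n", (const void*)test1.c_str(), (const void*)test2.c_str());
    const string &test3 = test1;    // const views of the same two objects
    const string &test4 = test2;
    // const operator[] is selected, so the shared buffer stays shared
    printf("test1:%p test2:%p\n", (const void*)&test3[0], (const void*)&test4[0]);
    return 0;
}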

Output:

test1:0x8c62014 test2:0x8c62014
test1:0x8c62014 test2:0x8c62014

Of course this is awkward to write. Why should I have to create two const references?
This is more natural:

#include <stdio.h>
#include <string>
using namespace std;
// Const reference parameters: operator[] resolves to the const overload
void proc(const string& test1, const string& test2)
{
    printf("test1:%p test2:%p\n", (const void*)&test1[0], (const void*)&test2[0]);
}
int main()
{
    string test1 = "hello";
    string test2(test1);
    printf("test1:%p test2:%p\n", (const void*)test1.c_str(), (const void*)test2.c_str());
    proc(test1, test2);
    return 0;
}

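In other words, be strict about whether data is const: if a function does not modify an argument, take it as const. Good habits improve code quality.
A string and a char * cannot share a buffer, so in C++ use raw pointers as little as possible; mixing the two styles gives the worst efficiency.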
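On the earlier question of telling reads from writes through operator[]: one known workaround is to return a proxy object instead of a reference. The sketch below is only an illustration under assumptions, not what libstdc++ does. ProxyString and CharProxy are invented names, it assumes C++11 (std::shared_ptr) and a single thread, and it lets the shared_ptr use count stand in for the reference count.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical COW string whose non-const operator[] returns a proxy.
class ProxyString {
    std::shared_ptr<std::string> buf_;
public:
    class CharProxy {
        ProxyString& owner_;
        std::size_t pos_;
    public:
        CharProxy(ProxyString& owner, std::size_t pos) : owner_(owner), pos_(pos) {}
        // Read: converting to char never copies the buffer.
        operator char() const {
            const std::string& s = *owner_.buf_;
            return s[pos_];
        }
        // Write: detach from the other owners first, then store.
        CharProxy& operator=(char c) {
            if (owner_.buf_.use_count() > 1)
                owner_.buf_ = std::make_shared<std::string>(*owner_.buf_);
            (*owner_.buf_)[pos_] = c;
            return *this;
        }
    };

    ProxyString(const char* s) : buf_(std::make_shared<std::string>(s)) {}
    // The implicit copy constructor shares buf_ between the two objects.

    CharProxy operator[](std::size_t pos) { return CharProxy(*this, pos); }
    char operator[](std::size_t pos) const { return (*buf_)[pos]; }
    const char* c_str() const { return buf_->c_str(); }
};
```

With this, char c = b[0]; on a non-const b leaves the buffer shared and only b[0] = 'w'; triggers the copy. The price is that b[0] is no longer a real char&, so code such as &b[0] stops yielding a char pointer, which is one reason a standard string cannot simply adopt this approach.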

[Last edited by yuxh on 2006-9-26 15:45]

yuxh

#2, posted 2006-09-26 15:42

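Am I getting more long-winded as I get older?
vector and list, however, generally have no COW: a copy copies everything.
You can wrap them yourself, though. Here is one I threw together; it is far from complete: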

#ifndef COW_CONTAINER_H
#define COW_CONTAINER_H 1
#include <cstddef>
#include <vector>
#include <list>

// Copy-on-write wrapper around a standard sequence container.
// m_refCount is the number of additional owners (0 means unshared).
// Not thread-safe.
template<typename Container>
class cow_container
{
public:
    typedef typename Container::value_type value_type;
    typedef typename Container::reference reference;
    typedef typename Container::const_reference const_reference;
    typedef typename Container::iterator iterator;
    typedef typename Container::const_iterator const_iterator;

    cow_container() : m_pCowNode(new cow_node) {}
    // Copying shares the node; the data itself is not copied.
    cow_container(const cow_container& other) : m_pCowNode(other.m_pCowNode)
    {
        m_pCowNode->m_refCount++;
    }
    cow_container(const_iterator first, const_iterator last)
        : m_pCowNode(new cow_node(Container(first, last)))
    {
    }
    // Counts elements instead of comparing iterators with <,
    // so this also works for std::list.
    cow_container(const_iterator first, std::size_t n) : m_pCowNode(0)
    {
        Container tmp;
        for (std::size_t i = 0; i < n; ++i, ++first)
            tmp.push_back(*first);
        m_pCowNode = new cow_node(tmp);
    }
    cow_container(const Container& data) : m_pCowNode(new cow_node(data)) {}
    ~cow_container()
    {
        release();
    }
    cow_container &operator=(const cow_container& other)
    {
        if (m_pCowNode != other.m_pCowNode) {
            release();
            m_pCowNode = other.m_pCowNode;
            m_pCowNode->m_refCount++;
        }
        return *this;
    }
    // A plain container always gets a fresh, unshared node.
    cow_container &operator=(const Container& data)
    {
        cow_node *fresh = new cow_node(data);   // allocate first: a throw leaves *this intact
        release();
        m_pCowNode = fresh;
        return *this;
    }
    // Const accessors never copy.
    const_iterator begin() const
    {
        return m_pCowNode->m_data.begin();
    }
    const_iterator end() const
    {
        return m_pCowNode->m_data.end();
    }
    // Non-const accessors may be used for writing, so they unshare first.
    iterator begin()
    {
        do_copy();
        return m_pCowNode->m_data.begin();
    }
    iterator end()
    {
        do_copy();
        return m_pCowNode->m_data.end();
    }
    const_reference operator[](int n) const
    {
        return m_pCowNode->m_data[n];
    }
    reference operator[](int n)
    {
        do_copy();
        return m_pCowNode->m_data[n];
    }
    void push_back(const value_type& val)
    {
        do_copy();
        m_pCowNode->m_data.push_back(val);
    }
    // position must have been obtained after this object was last copied;
    // otherwise do_copy() leaves it pointing into the old shared buffer.
    iterator insert(iterator position, const value_type& x)
    {
        do_copy();
        return m_pCowNode->m_data.insert(position, x);
    }
    const Container& container() const
    {
        return m_pCowNode->m_data;
    }
private:
    struct cow_node
    {
        int m_refCount;
        Container m_data;
        cow_node() : m_refCount(0) {}
        explicit cow_node(const Container& data) : m_refCount(0), m_data(data) {}
    };
    cow_node *m_pCowNode;
    // Drop this object's claim on the node; the last owner deletes it.
    void release()
    {
        if (m_pCowNode->m_refCount == 0)
            delete m_pCowNode;
        else
            m_pCowNode->m_refCount--;
    }
    // Give this object a private copy if the node is currently shared.
    void do_copy()
    {
        if (m_pCowNode->m_refCount > 0) {
            cow_node *fresh = new cow_node(m_pCowNode->m_data);
            m_pCowNode->m_refCount--;
            m_pCowNode = fresh;
        }
    }
};
template<class T>
class VECTOR : public cow_container< std::vector<T> > { };
template<class T>
class LIST : public cow_container< std::list<T> > { };
#endif
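For comparison, the same idea can be written much more briefly if the reference counting is handed to std::shared_ptr. This is a different technique from the class above, offered only as a sketch: it assumes C++11, is single-threaded, and Cow, read, write and shares_with are names invented here. Instead of relying on const and non-const overloads, the caller states the intent explicitly.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical wrapper: the shared_ptr use count plays the role of m_refCount.
template <typename Container>
class Cow {
    std::shared_ptr<Container> data_;
public:
    Cow() : data_(std::make_shared<Container>()) {}
    explicit Cow(const Container& c) : data_(std::make_shared<Container>(c)) {}
    // The implicit copy constructor, assignment and destructor
    // share and release the underlying container.

    // Read access never copies.
    const Container& read() const { return *data_; }

    // Write access clones the container first if it is shared.
    Container& write() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<Container>(*data_);
        return *data_;
    }

    // True while both wrappers still point at the same container.
    bool shares_with(const Cow& other) const { return data_ == other.data_; }
};
```

Copying a Cow<std::vector<int> > is then just a pointer copy, and the full element copy is postponed until the first write() on a shared instance. Because the caller must choose read() or write(), a plain read on a non-const object does not trigger a copy.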

[Last edited by yuxh on 2006-9-26 15:48]

Alligator27

#3, posted 2006-09-26 20:30

My gut feeling was that string uses a RefCount to decide whether to COW; looks like I need to look into it again.

reve (account deleted)

#4, posted 2006-09-26 23:12

[Content hidden automatically: the author was banned or deleted.]