2G or 40G? How Much Virtual Memory (Swap) Does Linux Really Need?

Posted: 2007-05-26
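I recently ran into the question of how much virtual memory (swap space) Linux should be given on a machine with a lot of RAM (4G). The Linux manuals, including those for RHEL3, Fedora, and Debian, all state that swap must not exceed 2G. That wording is ambiguous: does it mean that a single swap partition or swap file must not exceed 2G, or that the total swap across the whole system must not exceed 2G? The manuals' vague wording did not give me a satisfactory answer.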

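I searched Google for the keywords swap size limit 2G and found many discussion threads. They tend to get bogged down in whether that much memory is needed at all, or they simply trail off without a final answer.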

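An authoritative answer can be found in the record of a discussion on this question held by Linus and his colleagues in 2001. In that discussion, they ultimately overturned the traditional 2*RAM SIZE recommendation.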

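I. The 2G swap limit

1. Swap limits in early Linux
Kernels before Linux 2.2 support Linux swap partitions or files of at most 128M each, and at most 8 swap partitions or files in total.
So before Linux 2.2, the maximum usable swap is 128M*8=1G.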


2. Swap limits in Linux 2.2.x
Linux 2.2.x supports Linux swap partitions or files of at most 2G each, and at most 8 swap partitions or files in total.
So under Linux 2.2.x, the maximum usable swap is 2G*8=16G.


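3. Swap limits in Linux 2.4.x
Before Linux 2.4.10, the kernel supports Linux swap partitions or files of at most 2G each, and at most 8 swap partitions or files in total.
So before Linux 2.4.10, the maximum usable swap is 2G*8=16G.

From Linux 2.4.10 onward (inclusive), the kernel supports Linux swap partitions or files of at most 2G each, and at most 32 swap partitions or files in total.
So from Linux 2.4.10 onward, the maximum usable swap is 2G*32=64G.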
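
The limits in this section all come from one piece of arithmetic: the upper bound on total swap is the per-area size limit multiplied by the maximum number of swap areas. A minimal Python sketch of that calculation follows. The figures are the ones stated in this article (using the count of 8 areas that the 128M*8 calculation relies on); the names `LIMITS`, `max_total_swap` and `summary` are illustrative and do not come from the original.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# (per-area size limit in bytes, maximum number of swap areas),
# taken from the figures stated in this article.
LIMITS = {
    "before 2.2": (128 * MIB, 8),
    "2.2.x": (2 * GIB, 8),
    "2.4.x before 2.4.10": (2 * GIB, 8),
    "2.4.10 and later": (2 * GIB, 32),
}


def max_total_swap(per_area_bytes, max_areas):
    """Upper bound on total swap: every area at its maximum size."""
    return per_area_bytes * max_areas


def summary():
    """Map each kernel range to its maximum total swap in whole GiB."""
    return {
        kernel: max_total_swap(size, count) // GIB
        for kernel, (size, count) in LIMITS.items()
    }


if __name__ == "__main__":
    for kernel, gib in summary().items():
        print(f"{kernel}: {gib}G")
```

The table makes the answer to the opening question explicit: on the article's reading, 2G is a per-area limit, not a limit on the system total.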

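In its memory management policy, Linux 2.4.x made a change much like the one from Windows 98 to Windows 2000: it keeps more dirty pages in memory and in swap instead of reclaiming memory promptly, which improves system efficiency (see the discussion in chapter 18 of 《Windows核心编程》).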

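In addition, if you have more than one disk and create swap on each of them, Linux uses those swap areas in a RAID 0-like (striped) fashion, provided the areas are given the same swap priority.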
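
Whether swap areas are actually used in this striped way depends on their priorities: the kernel allocates pages round-robin only among areas of equal priority, and uses higher-priority areas first. The sketch below, offered as an assumption-laden illustration, groups the entries of `/proc/swaps` by priority so that groups with more than one member can be spotted. The `SAMPLE` text, device names and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical /proc/swaps content, used when the real file is unavailable.
SAMPLE = """\
Filename                Type            Size     Used  Priority
/dev/sda2               partition       2097148  0     5
/dev/sdb2               partition       2097148  0     5
/swapfile1              file            2097148  0     -2
"""


def group_by_priority(proc_swaps_text):
    """Return {priority: [swap area names]} parsed from /proc/swaps text."""
    groups = defaultdict(list)
    lines = proc_swaps_text.strip().splitlines()
    for line in lines[1:]:  # the first line is the column header
        line = line.strip()
        if not line:
            continue
        # The last four columns are Type, Size, Used, Priority.
        name, _type, _size, _used, priority = line.rsplit(None, 4)
        groups[int(priority)].append(name)
    return dict(groups)


def striped_groups(groups):
    """Keep only priorities shared by more than one swap area."""
    return {p: names for p, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    try:
        with open("/proc/swaps") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    for priority, names in striped_groups(group_by_priority(text)).items():
        print(f"priority {priority}: pages are spread across {', '.join(names)}")
```

If no group has more than one member, the areas are filled one after another in priority order instead of being striped.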



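II. How much swap is best?

1. Linux 2.2.x and earlier
The traditional advice of making swap twice the size of RAM holds. Allocate swap accordingly.

2. Linux 2.4.x and later
Within a reasonable disk budget, the more swap the better.

Linus pointed out that with 512M of RAM you can probably afford another 40G or so of hard disk, so a very large swap area is affordable. Zlatko raised performance concerns with Linus and then ran his own tests; he could not find any problems with the 2.4.0 memory management logic.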




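III. Example

An IBM x365 server with 4G of RAM.
1. Swap should be no less than 4G
Create two swap partitions of 2G each as the base 4G of swap.

2. Create 8 swap files of 2G each as extended swap
This brings total swap to about 20G; if the disk is larger, more can be added.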
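
The plan above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 2G per-area limit and the 32-area limit stated earlier for Linux 2.4.10 and later; `plan_swap` and its parameters are illustrative names, not part of any tool.

```python
GIB = 1024 ** 3
MAX_AREA_BYTES = 2 * GIB  # per-area limit stated in this article
MAX_AREAS = 32            # area-count limit stated for 2.4.10 and later


def plan_swap(ram_bytes, extra_files):
    """Base swap covering RAM in 2G partitions, plus extra 2G swap files."""
    # Ceiling division: enough 2G partitions for base swap >= RAM.
    base_partitions = -(-ram_bytes // MAX_AREA_BYTES)
    total_areas = base_partitions + extra_files
    if total_areas > MAX_AREAS:
        raise ValueError(f"{total_areas} swap areas exceeds the limit of {MAX_AREAS}")
    return {
        "base_partitions": base_partitions,
        "swap_files": extra_files,
        "total_bytes": total_areas * MAX_AREA_BYTES,
    }


if __name__ == "__main__":
    plan = plan_swap(4 * GIB, extra_files=8)
    print(plan["base_partitions"], "partitions +", plan["swap_files"], "files =",
          plan["total_bytes"] // GIB, "G")
```

For the 4G server this yields 2 partitions plus 8 files, 10 swap areas in total, which leaves room under the 32-area limit for further swap files.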


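Aside: swap usage in Windows
Starting with Windows 2000, Microsoft also changed its memory usage policy so that memory is reclaimed as late as possible. I therefore personally believe that a large swap area (paging file) is also more efficient on Windows 2000 and later.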


Appendix: record of the Linus discussion
2. Greater 2.4 Swap Requirements

7 Jan 2001 - 18 Jan 2001 (100 posts) Archive Link: "Subtle MM bug"
Topics: Virtual Memory
People: Rik van Riel, Linus Torvalds, Eric W. Biederman, Zlatko Calusic

In the course of discussion, it became clear that Linux 2.4.x required more swap than previous versions. Rik van Riel mentioned, "2.4 keeps dirty pages in the swap cache, so you will need more swap to run the same programs..." He asked Linus Torvalds, "is this something we want to keep or should we give the user the option to run in a mode where swap space is freed when we swap in something non-shared ?" Linus replied:

I'd prefer just documenting it and keeping it. I'd hate to have two fairly different modes of behaviour. It's always been the suggested "twice the amount of RAM", although there's historically been the "Linux doesn't really need that much" that we just killed with 2.4.x.

If you have 512MB of RAM, you can probably afford another 40GB or so of harddisk. They are disgustingly cheap these days.


Zlatko Calusic worried that more data in swap would degrade performance because the disk head would need more seek time to find data. He asked if Linus was sure this would be okay, and Linus replied, "I'm not _sure_, obviously. However, one thing I _am_ sure of is that the sticky page-cache simplifies some things enormously, and make some things possible that simply weren't possible before." But in a nearby post he admitted, "the sticky allocation _might_ make the IO we do be more spread out." He felt it was important to consider these kinds of potential downsides, though he felt that in this case the benefits outweighed the drawbacks; and at one point Eric W. Biederman explained succinctly, "The tradeoff when implemented correctly is that writes will tend to be more spread out and reads should be better clustered together."


Zlatko ran some tests, and could not find any problems with the 2.4.0 memory management logic, though he added, "I have found that new kernel allocates 4 times more swap space under some circumstances. That may or may not be alarming, it remains to be seen." At one point, Linus gave his overall take on 2.2/2.4 performance issues. He said:

I personally think 2.4.x is going to be as fast or faster at just about anything. We do have some MM issues still to hash out, and tuning to do, but I'm absolutely convinced that 2.4.x is going to be a _lot_ easier to tune than 2.2.x ever was. The "scan the page tables without doing any IO" thing just makes the 2.4.x memory management several orders of magnitude more flexible than 2.2.x ever was.

(This is why I worked so hard at getting the PageDirty semantics right in the last two months or so - and why I released 2.4.0 when I did. Getting PageDirty right was the big step to make all of the VM stuff possible in the first place. Even if it probably looked a bit foolhardy to change the semantics of "writepage()" quite radically just before 2.4 was released).

Elsewhere, he considered the case of swapless or low-swap machines:

If you don't have any swap, or if you run out of swap, the major difference between 2.2.x and 2.4.x is probably going to be the oom handling: I suspect that 2.4.x might be more likely to kill things off sooner (but it tries to be graceful about which processes to kill).

Not having any swap is going to be a performance issue for both 2.2.x and 2.4.x - Linux likes to push inactive dirty pages out to swap where they can lie around without bothering anybody, even if there is no _major_ memory crunch going on.

If you do have swap, but it's smaller than your available physical RAM, I suspect that the Linux-2.4 swap pre-allocate may cause that kind of performance degradation earlier than 2.2.x would have. Another way of putting this: in 2.2.x you could use a fairly small swap partition to pick up some of the slack, and in 2.4.x a really small swap-partition doesn't really buy you much anything.


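Below is my own rendering of some of these points. My English is limited, so there are bound to be mistakes; corrections are welcome.

The discussion made it clear that Linux 2.4 needs more swap than earlier versions. Rik van Riel pointed out that 2.4 keeps dirty pages in the swap cache, so running the same programs requires more swap. He asked Linus Torvalds whether this behaviour should be kept, or whether users should be given the option of a mode in which swap space is freed when something non-shared is swapped in. Linus replied that he would rather document the behaviour and keep it, and that he would hate to have two fairly different modes of behaviour. The suggestion has always been "twice the amount of RAM", although historically there has also been the view that "Linux doesn't really need that much", and 2.4.x has put an end to that view.

If you have 512M of RAM, you can probably afford another 40G or so of hard disk; disks are very cheap these days.


Zlatko Calusic worried that more data in swap would degrade performance, because the disk head would need more seek time to find the data. He asked Linus whether he was sure this would be fine. Linus replied that he was obviously not sure, but that the one thing he was sure of is that the sticky page cache simplifies some things enormously and makes possible some things that were not possible before. In a nearby post, however, he admitted that the sticky allocation might make the IO more spread out.

Zlatko ran some tests and found no problems with the memory management logic in 2.4.0. He added that the new kernel allocates 4 times more swap space under some circumstances, and that this may or may not be alarming; it remains to be seen.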

Reposted from: http://www.ltesting.net