[Discussion] The official explanation for why the GTX 970 slows down sharply once VRAM usage exceeds 3.5GB

Posted on 2015-1-27 18:16
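Beyond what 2011dm already covered in the thread http://bbs.3dmgame.com/thread-4608556-1-1.html, here are two more points from the original article.
On the 970, the last L2 cache/ROP unit has been cut, while the SMMs and DRAM that go with it are intact. The crossbar port serving that group can therefore receive twice as many requests as the others, because one L2/ROP unit is gone but two memory modules are still attached.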


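This is fine as long as the port is not saturated. But once port 7 (see the figure below) is fully busy and requests for the eighth DRAM keep arriving, the other six ports end up waiting on port 7, which has to handle a double share. They can stay only about half busy, and that drags down throughput for the data in the other 3.5GB as well. NVIDIA's workaround on the 970 is to map the first 3.5GB of VRAM into one pool and split off the last 0.5GB on its own. As long as usage stays under 3.5GB, all data moves within the first pool and nothing is placed in the last 0.5GB.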
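The 2:1 request ratio can be checked with a simple bottleneck model. This is only a sketch under my own assumptions: every crossbar port has the same capacity (taken as 1 unit), traffic is spread evenly over the DRAMs, and the function name `sustained_throughput` is made up for illustration.

```python
from fractions import Fraction

def sustained_throughput(request_shares):
    """Highest total request rate (in units of one port's capacity)
    at which no crossbar port is asked for more than it can serve."""
    total = sum(request_shares)
    # A port with share s sees s/total of all traffic, so it saturates
    # once the total rate reaches total/s. The busiest port sets the limit.
    return min(Fraction(total, s) for s in request_shares)

# Hypothetical layout A: stripe over all eight DRAMs; port 7 fronts two of them.
unbalanced = sustained_throughput([1, 1, 1, 1, 1, 1, 2])
# Layout B: stripe over seven DRAMs only, one per port.
balanced = sustained_throughput([1] * 7)

# Load on each of the six ordinary ports once port 7 is saturated.
other_port_load = unbalanced * Fraction(1, 8)

print(unbalanced, balanced, unbalanced / balanced, other_port_load)
```

In this model the eight-way layout tops out at 4 units against 7 for the balanced layout, with ports 1-6 each half busy, which is in line with the article's "roughly half of peak".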


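Once VRAM usage goes past 3.5GB, the situation described above kicks in: port 7 has to handle twice as many requests as each of ports 1-6, and overall throughput drops sharply. Accessed on its own, the last 0.5GB runs at only one seventh the speed of the first 3.5GB, because it cannot be accessed in parallel with the first 3.5GB.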
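To put rough numbers on the one-seventh figure, here is a small sketch. It assumes all eight DRAMs are equally fast, that the 3.5GB pool is striped over seven of them and the 0.5GB pool sits on one, and that the two pools are read one after the other rather than at the same time. The function names are hypothetical, and bandwidth is expressed in units of a single DRAM's bandwidth rather than real GB/s.

```python
from fractions import Fraction

FAST_DRAMS = 7   # DRAMs behind the 3.5GB pool
SLOW_DRAMS = 1   # DRAM behind the 0.5GB pool

def sweep_time(fast_gb, slow_gb):
    """Time to read both regions once, in units of 'GB per single-DRAM bandwidth'.
    Assumes the pools are read back to back, never concurrently."""
    return Fraction(fast_gb) / FAST_DRAMS + Fraction(slow_gb) / SLOW_DRAMS

def effective_bandwidth(fast_gb, slow_gb):
    """Average bandwidth over the whole sweep, in single-DRAM units."""
    return (Fraction(fast_gb) + Fraction(slow_gb)) / sweep_time(fast_gb, slow_gb)

fast_only = effective_bandwidth(Fraction(7, 2), 0)               # 3.5GB pool alone
slow_only = effective_bandwidth(0, Fraction(1, 2))               # 0.5GB pool alone
full_sweep = effective_bandwidth(Fraction(7, 2), Fraction(1, 2)) # all 4GB

print(fast_only, slow_only, slow_only / fast_only, full_sweep)
```

Under these assumptions the small pool alone delivers 1/7 of the large pool's bandwidth, and a sweep over all 4GB spends as long in the last 0.5GB as in the first 3.5GB.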




Original article: http://www.pcper.com/reviews/Graphics-Cards/NVIDIA-Discloses-Full-Memory-Structure-and-Limitations-GTX-970

Relevant excerpts:

But the GTX 970 and its altered design has to do things differently. If you walked across the memory interface in the exact same way, over the same 4GB capacity, the 7th crossbar port would tend to always get twice as many requests as the other port (because it has two memories attached).  In the short term that could be ok due to queuing in the memory path.  But in the long term if the 7th port is fully busy, and is getting twice as many requests as the other port, then the other six must be only half busy, to match with the 2:1 ratio.  So the overall bandwidth would be roughly half of peak. This would cause dramatic underutilization and would prevent optimal performance and efficiency for the GPU.

Let's be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory.
To avert this, NVIDIA divided the memory into two pools, a 3.5GB pool which maps to seven of the DRAMs and a 0.5GB pool which maps to the eighth DRAM.  The larger, primary pool is given priority and is then accessed in the expected 1-2-3-4-5-6-7-1-2-3-4-5-6-7 pattern, with equal request rates on each crossbar port, so bandwidth is balanced and can be maximized. And since the vast majority of gaming situations occur well under the 3.5GB memory size this determination makes perfect sense. It is those instances where memory above 3.5GB needs to be accessed where things get more interesting.
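The two-pool layout in the excerpt can be pictured as an address-to-DRAM mapping. This is an illustrative sketch, not NVIDIA's actual mapping: the stripe size `STRIPE_BYTES` is a made-up value (the article gives no granularity), and `dram_for_address` is a hypothetical helper.

```python
GIB = 1 << 30
FAST_POOL_BYTES = 7 * GIB // 2   # the 3.5GB pool
TOTAL_BYTES = 4 * GIB
STRIPE_BYTES = 1024              # assumption: the source does not state the stripe size

def dram_for_address(addr):
    """Return the DRAM (1-8) that would hold a given byte address in this model."""
    if not 0 <= addr < TOTAL_BYTES:
        raise ValueError("address outside the 4GB range")
    if addr < FAST_POOL_BYTES:
        # Primary pool: round-robin over DRAMs 1..7, one stripe each.
        return (addr // STRIPE_BYTES) % 7 + 1
    # Secondary pool: everything above 3.5GB lands on the eighth DRAM.
    return 8

# DRAMs hit by the first 14 consecutive stripes.
pattern = [dram_for_address(i * STRIPE_BYTES) for i in range(14)]
print(pattern)
print(dram_for_address(FAST_POOL_BYTES))
```

Walking the first pool stripe by stripe reproduces the 1-2-3-4-5-6-7-1-2-3-4-5-6-7 pattern from the excerpt, while any address at or above 3.5GB maps to the eighth DRAM alone.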
