[RFC] Using DC in amdgpu for upcoming GPU

Kevin Brace kevinbrace at gmx.com
Thu Dec 15 15:48:39 UTC 2016


Hi,

I have been reading the ongoing discussion about what to do with AMD DC (Display Core) with great interest, since I have started to put more time into developing OpenChrome DRM for VIA Technologies Chrome IGP.
I particularly enjoyed reading what Tony Cheng wrote about what is going on inside AMD Radeon GPUs.
As a graphics stack developer, I suppose I am still only somewhat above a beginner level, and Chrome IGP might be considered garbage graphics to some (I do not really care what people say or think about it), but since my background is really digital hardware design (self-taught) rather than graphics device driver development, I would like to add my 2 cents (U.S.D.) to the discussion.
I also consider myself an amateur semiconductor industry historian, and in particular, I have been a close watcher of Intel's business / hiring practices for many years.
What I am writing may not make sense to some, or may even offend some (my guess is the people who work at Intel), but I will not pull any punches, and if you do not like what I write, let me know. (That does not mean I will necessarily take back my comment even if it offends you. I typically stand behind what I say, unless it is obvious that I am wrong.)
    While my understanding of DRM is still quite primitive, my simplistic understanding is that AMD is pushing DC because of the following factors.

1) AMD is understaffed due to the precarious financial condition it is in right now (i.e., less than $1 billion cash on hand and the loss of roughly 7,000 employees since around Year 2008)
2) The complexity of the next generation ASIC is only getting worse due to continued process scaling, which means more transistors to deal with (i.e., TSMC 28 nm to GF 14 nm to probably Samsung / TSMC 10 nm or GF 7 nm)
3) Based on 1 and 2, unless design productivity can be improved, AMD will be late to market, and that could be the end of AMD as a corporation
4) Hence, in order to meet time to market (TtM) targets and improve engineer productivity, AMD needs to reuse the existing pre-silicon / post-silicon bring-up test code and share the code with the Windows side of the device driver developers (see the rough sketch after this list)
5) In addition, power is already the biggest design challenge, and very precise power management is crucial to the performance of the chip (i.e., it's not all about the laptop anymore, and desktop "monster" graphics cards also need power management for performance reasons, in order to manage heat generation)
6) AMD Radeon is really running an RTOS (Real Time Operating System) inside the GPU card, and they want to put the code that handles initialization / power management closer to the GPU rather than on the slower-to-respond x86 (or any other general purpose) microprocessor
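
For readers less familiar with the HAL idea behind points 4 through 6, here is a rough, hypothetical C sketch of the pattern (the structure and function names below are mine for illustration, not the actual DC interfaces): the OS-independent display code calls only through an abstract table of OS services, so the same core code can be linked against a Linux backend, a Windows backend, or a diagnostic / bring-up environment.

/*
 * Hypothetical sketch of a HAL-style split -- not the actual DC code.
 * The shared core knows the hardware but never calls an OS API directly;
 * each OS supplies its own os_services table.
 */
#include <stdio.h>
#include <stdlib.h>

struct os_services {
	void *(*alloc)(size_t size);
	void  (*release)(void *ptr);
	void  (*log)(const char *msg);
	void  (*reg_write)(unsigned int reg, unsigned int val);
};

/* OS-independent core code, shared between OS builds unchanged. */
static void core_program_timing(const struct os_services *os,
				unsigned int htotal, unsigned int vtotal)
{
	os->log("programming CRTC timing");
	os->reg_write(0x1000, htotal);	/* register offsets are made up */
	os->reg_write(0x1004, vtotal);
}

/* One possible "Linux" backend; a Windows build would supply its own. */
static void *linux_alloc(size_t size) { return malloc(size); }
static void linux_release(void *ptr) { free(ptr); }
static void linux_log(const char *msg) { printf("[drm] %s\n", msg); }
static void linux_reg_write(unsigned int reg, unsigned int val)
{
	printf("[drm] MMIO write 0x%04x <- 0x%x\n", reg, val);
}

static const struct os_services linux_services = {
	.alloc     = linux_alloc,
	.release   = linux_release,
	.log       = linux_log,
	.reg_write = linux_reg_write,
};

int main(void)
{
	/* Only the services table differs per OS; the core is reused as-is. */
	core_program_timing(&linux_services, 2200, 1125);
	return 0;
}

In a real driver the Linux table would wrap kmalloc and the DRM helpers while the Windows table would wrap the corresponding kernel services, but the point is simply that the shared core never touches an OS API directly, which is what makes the Windows / Linux / bring-up code reuse possible.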


Since I will probably need to obtain "favors" down the road when I try to get OpenChrome DRM mainlined, I probably should not go too deep into what I think of how Intel works on their graphics device driver stack (I do not mean to make this personal, but Intel is the "other" open source camp in the OSS x86 graphics world, so I find it fair game to discuss the approach Intel takes from a semiconductor industry perspective. I am probably going to overgeneralize what is going on, so if you want to correct me, let me know.), but based on my understanding of how Intel works, Intel probably has more staffing resources than AMD when it comes to graphics device driver stack development (and on the x86 microprocessor development side).
Based on my understanding of where Intel stands financially, I feel like Intel is standing on very thin ice due to the following factors, and I predict that they will eventually adopt an AMD DC-like design concept (i.e., the use of a HAL).
Here is my logic.

1) PC (desktop and laptop) x86 processors are not selling very well, and my understanding is that since the Year 2012 peak, x86 processor shipments are down 30% as of Year 2016 (at, I would say, around a $200 ASP)
2) Intel's margins are being propped up by its unnaturally high data center market share (99% for x86 data center microprocessors) and a very high data center x86 processor ASP (Average Selling Price) of $600 (up from $500 a few years ago due to AMD screwing up the Bulldozer microarchitecture; more on this later)
3) Intel did a significant layoff in April 2016 that targeted older (read: "expensive"), experienced engineers
4) Like Cisco Systems (notorious for its annual summertime layoff of roughly 5,000 people), Intel then turns around and goes on a hiring spree, hiring from the graduate programs of many U.S. second and third tier universities, bringing down the overall experience level of the engineering departments
5) While AMD is in desperate financial shape, it likely has one last chance with the Zen microarchitecture to get back into the game (Zen will be the last chance for AMD, IMO.)
6) Since AMD is now fabless due to the divestiture of its fabs in Year 2009 (GLOBALFOUNDRIES), it no longer has the financial burden of having to pay for a fab, whereas Intel "had to" delay 10 nm process deployment to 2H'17 due to weak demand for 14 nm process products and low utilization of the 14 nm process (low utilization delays the amortization of the 14 nm process; Intel historically amortized a given process technology in 2 years, but 14 nm is starting to look like 2.5 to 3 years due to the yield issues they encountered in 2014)
7) Inevitably, the magic of market competition will drag down Intel's ASPs (both PC and data center) since the Zen microarchitecture is a rather straightforward x86 microarchitectural implementation (i.e., not too far from Skylake); hence, their low-60s percent gross margin will be under pressure from AMD starting in Year 2017
8) Intel overpaid by $8 billion for Altera (a struggling FPGA vendor whose CEO probably felt like he had to sell the corporation in order to cover up the Stratix 10 FPGA development screw-up of missing the tape-out target date by 1.5 years), and next generation process technology is getting ever more expensive (10 nm, 7 nm, 5 nm, etc.)
9) In order to "please" Wall Street, Intel management will possibly do further destructive layoffs every year, and if I were to guess, will likely lay off another 25,000 to 30,000 people over the next 3 to 4 years
10) Intel has already lost experienced engineers in the past layoffs, replacing them with far less experienced engineers hired relatively recently, mostly from second and third tier U.S. universities
11) Now, with another 25,000 to 30,000 layoffs, management will force the software engineering side to reorganize, and Intel will be "forced" to come up with ways to reuse their graphics stack code (i.e., sharing more code between Windows and Linux)
12) Hence, maybe a few years from now, Intel people will have to do something similar to AMD DC in order to improve their design productivity, since they will no longer be able to throw people at the problem (They tend to overhire new college graduates (NCGs) because they are cheaper, which allowed them to throw people at the problem relatively cheaply until recently. High x86 ASPs allowed this as well, and they got too used to it for too long. They will not be able to do this in the future. In the meantime, their organizational experience level is coming down from hiring too many NCGs and laying off too many experienced people at the same time.)


I am sure there are people who will not be happy reading this, but this is my harsh, honest assessment of what Intel is going through right now and what will happen in the future.
I am sure I will be effectively blacklisted from working at Intel for writing what I just wrote (that is okay since I am not interested in working at Intel), but I came to this conclusion based on what various people who used to work at Intel have told me and on observing Intel's hiring practices for a number of years.
In particular, one person who worked on the Intel 740 project (i.e., the long forgotten discrete AGP graphics chip from 1998) on the technical side has told me that Intel is really terrible at IP (Intellectual Property) core reuse, and that Intel frequently redesigns too many portions of their ASICs.
Based on that, I am not too surprised to hear that Intel does Windows and Linux graphics device driver stack development separately. (That is what I read.)
In other words, Intel is bloated from a staffing point of view. (I do not necessarily like to see people lose their jobs, but compared to AMD and NVIDIA, Intel is really bloated. The same person who worked on the Intel 740 project told me that Intel employee productivity is much lower than that of competitors like AMD and NVIDIA on a per-employee basis, and they have not been able to fix this for years.)
Despite the constant layoffs, Intel's employee count has not really gone down for the past few years (it has stayed around 100,000 for the past 4 years), but eventually Intel will have to get rid of people in absolute numbers.
Intel also relies heavily on its "shadow" workforce of interns (from local universities, especially foreign master's degree students desperate to pay off part of their high out-of-state tuition) and contractors / consultants, so their "real" employee count is probably closer to 115,000 or 120,000.
I get unsolicited e-mails about Intel-related contractor / consultant positions from recruiters possibly located 12 time zones away from where I reside (please do not call me a racist for pointing this out, since I find it so weird as a U.S. citizen) almost every weekday (M-F), and I am always surprised at the type of work Intel wants contractors to do.
Many of the positions are for highly specialized work (I saw a graphics device driver contract position recently), and it has been like this for several years already.
I no longer bother with Intel because of this, since they appear unwilling to commit to properly employing highly technical people.
Going back to the graphics world, my take is that Intel will have to get used to doing the same work with far fewer people, and they will need to change their corporate culture of throwing people at the problem very soon, since their x86 ASPs will be crashing down fairly soon and AMD will likely never repeat the Bulldozer microarchitecture screw-up again. (Intel got lucky when the former IBM PowerPC architects AMD hired around Year 2005 screwed up Bulldozer. A speed-demon design is a disaster on a power-constrained post-90 nm process node; they tried to compensate for Bulldozer's low IPC with high clock frequency. Intel learned the same painful lesson about power with the NetBurst microarchitecture between Year 2003 and 2005. Also, AMD management at the time seems to have believed in the many-core concept too seriously. AMD had to live with the messed-up Bulldozer for 10+ years with disastrous financial results.)
    I do understand that what I am writing is not terribly technical in nature (it is more like the corporate strategy stuff that business / marketing side people worry about), but I feel like what AMD is doing is quite logical (i.e., using a higher abstraction level for initialization / power management, and reusing code).
Sorry for the off-topic assessment of Intel (i.e., the hiring practice and x86 stuff). Based on the subsequent messages, it appears that DC can be rearchitected to satisfy Linux kernel developers, but overall I feel there is a lack of appreciation here for the concept of design reuse, even though in the ASIC / FPGA design world this is very normal. (It has been like this since the mid-'90s, when ASIC engineers had to start doing it regularly.)
The AMD side people appear to have been trying to apply this concept to the device driver side as well.
Considering AMD's meager staffing resources (currently approximately 9,000 employees, less than 1/10 of Intel's, although Intel owns many fabs and product lines, so the actual developer staffing disadvantage is probably more like a 1:3 to 1:5 ratio), I am not too surprised to read that it is trying to improve productivity where it can, and combining some portions of the Windows and Linux code makes sense.
I would imagine that NVIDIA is doing something like this already (but closed source).
Again, I will almost bet that Intel will adopt an AMD DC-like concept in the next few years.
Let me know if I was right in a few years.

Regards,

Kevin Brace
The OpenChrome Project maintainer / developer

