Computer Systems: A Programmer's Perspective (3rd Edition)
ISBN: 9780134092669
Author: Randal E. Bryant, David R. O'Hallaron
Publisher: PEARSON
Textbook Question
Chapter 5, Problem 5.14HW
Write a version of the inner product procedure described in Problem 5.13 that uses 6 × 1 loop unrolling. For x86-64, our measurements of the unrolled version give a CPE of 1.07 for integer data but still 3.01 for both floating-point data types.
- A. Explain why any (scalar) version of an inner product procedure running on an Intel Core i7 Haswell processor cannot achieve a CPE less than 1.00.
- B. Explain why the performance for floating-point data did not improve with loop unrolling.
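
A minimal sketch of what a 6 × 1 unrolled inner product could look like follows. The procedure from Problem 5.13 is not reproduced here, so the array-based signature, the name `inner_unroll6x1`, and the `data_t` typedef are assumptions made for illustration rather than the textbook's own code. The loop handles six elements per iteration but keeps a single accumulator, which is what "6 × 1" denotes.

```c
#include <stddef.h>

/* Assumption: element type; swap in int, long, or float to test other types. */
typedef double data_t;

/*
 * Inner product with 6 x 1 loop unrolling (hypothetical signature).
 * Six elements are processed per iteration, all feeding ONE accumulator,
 * so every addition still depends on the result of the previous one.
 */
void inner_unroll6x1(const data_t *u, const data_t *v, long length, data_t *dest)
{
    long i;
    long limit = length - 5;   /* last index at which a full group of 6 fits */
    data_t sum = (data_t) 0;

    /* Main loop: six multiply-adds per iteration, one serial chain on sum. */
    for (i = 0; i < limit; i += 6) {
        sum = sum + u[i]     * v[i];
        sum = sum + u[i + 1] * v[i + 1];
        sum = sum + u[i + 2] * v[i + 2];
        sum = sum + u[i + 3] * v[i + 3];
        sum = sum + u[i + 4] * v[i + 4];
        sum = sum + u[i + 5] * v[i + 5];
    }

    /* Cleanup loop: handle the remaining 0 to 5 elements. */
    for (; i < length; i++) {
        sum = sum + u[i] * v[i];
    }

    *dest = sum;
}
```

The sketch also hints at both parts of the question. Each element requires two loads (one from each vector), and the Haswell core described in the book has two load units, so at most one element can be processed per cycle, which would account for the 1.00 bound in part A. For part B, the six additions in the loop body still form one sequential chain through `sum`, so the floating-point addition latency of about 3 cycles remains on the critical path; unrolling reduces loop overhead but does not shorten that chain.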