Power and Performance: Software Analysis and Optimization
Ebook · 557 pages · 4 hours


About this ebook

Power and Performance: Software Analysis and Optimization is a guide to solving performance problems in modern Linux systems. Power-efficient chips are no help if the software those chips run on is inefficient. Starting with the necessary architectural background as a foundation, the book demonstrates the proper usage of performance analysis tools in order to pinpoint the cause of performance problems, and includes best practices for handling common performance issues those tools identify.

  • Provides expert perspective from a key member of Intel’s optimization team on how processors and memory systems influence performance
  • Presents ideas to improve architectures running mobile, desktop, or enterprise platforms
  • Demonstrates best practices for designing experiments and benchmarking throughout the software lifecycle
  • Explains the importance of profiling and measurement to determine the source of performance issues
Language: English
Release date: April 27, 2015
ISBN: 9780128008140
Author

Jim Kukunas

Jim Kukunas began programming at a young age, teaching himself C and x86 assembly. He is an alumnus of Allegheny College with a degree in Computer Science. Today, he is a software engineer in Intel's Open Source Technology Center. As a performance optimization engineer on the core Linux kernel team, much of his work focuses on kernel-space and user-space performance optimizations. His efforts have enhanced many projects, including the Linux kernel, zlib, the Enlightenment Foundation Libraries, MeeGo, and Android, among others.


    Book preview

    Power and Performance - Jim Kukunas


    Introduction

    Abstract: This introductory chapter welcomes the reader into the exciting and rewarding world of performance analysis and optimization. First, the reader is guided through a short explanation of what software optimization entails. Second, some of the more significant motivations behind performance optimization are enumerated. Specifically, this section focuses on the impact of software performance on power-efficiency, including concepts such as race to idle, and user behavior. Third is a discussion of the concept of premature optimization and how the reader can differentiate appropriate and inappropriate optimizations. Finally, the reader is presented with a roadmap that outlines how the rest of the book’s content is organized.

    Keywords: Software performance; Software optimization; Power efficiency; Race to idle; User behavior; Premature optimization

    As engineers, we are constantly striving to perfect our craft: aspiring to architect flawless systems, diligently pursuing bug-free deliverables, and zealously sharpening our skillset on the proverbial whetstone of experience. As a product of these joint experiences, the relatively young discipline of software engineering continues to rapidly mature in fields such as validation and verification, software prototyping, and process modeling. However, one important aspect of software engineering is often overlooked: software optimization, that is, the craft of tuning software to efficiently utilize resources in such a way as to impact the user’s experience.

    This definition yields three important insights. Firstly, because software optimization is the craft of tuning software, it is an ongoing data-driven process that requires constant feedback in the form of performance measurements and profiling. Secondly, all computing resources are limited and of varying scarcity, and thus their allocation must be carefully planned. Thirdly, optimizations must have an impact on the user experience. This impact can manifest itself in many different ways, such as improved battery life, the ability to handle increased datasets, or improved responsiveness.

    Unfortunately, some developers look at performance analysis and optimization as arcane and unnecessary; a fossil of computing antiquity predating optimizing compilers and multi-core systems. So are these engineers correct in their assessment? Is performance a dated and irrelevant topic? The answer is a resounding No! Before we dive into the methodologies and background for performance analysis, let’s first discuss why performance analysis is crucial to modern software engineering.

    Performance Apologetic

    There are two primary reasons why software optimization is necessary. The first reason is that software that performs well is often power-efficient software. This is obviously important to developers whose deliverables run on mobile devices, such as laptops, tablets, and phones, but it is also important to engineers of desktop and server products, whose power issues manifest themselves differently, typically in the form of thermal issues and high operating costs. For instance, data centers often, as part of their arrangements with the utility company, have minimum and maximum daily power consumption limits. This, combined with the fact that few engineers have the luxury of knowing or controlling where their users will install their software, means that every software engineer should care about writing power-efficient software. The second reason is that performance and its related costs have a powerful effect on usage patterns.

    Performance Is Power Efficiency

    It’s natural to think about power efficiency and performance as diametrically opposed concepts. For example, consumers tend to understand that while our server-class Intel® Xeon® processor line provides stellar performance, it won’t be as power-efficient in a tablet as our mobile Intel® Atom™ platform. Many users are familiar with the tunables, exposed by the drivers of hardware such as wireless and graphics cards, asking users to choose between power-efficient operation, maximally performant operation, or a balance between the two. However, when dealing with software, this distinction doesn’t exist.

    In the next few chapters, we’ll dive deeper into the details of the power management features available in Intel® processors. For now, it is enough to understand that there are two general ways the CPU saves power dynamically at runtime. The first technique, known as frequency scaling, temporarily reduces the operating frequency and voltage of the processor. For example, a CPU might have a maximum operating frequency of 1.6 GHz, and the ability to scale the frequency as low as 600 MHz during light loads.
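
    For the curious reader, the frequency selected by this scaling is visible at runtime through the Linux cpufreq sysfs interface. What follows is a minimal sketch, not a definitive tool; it assumes a kernel exposing /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq (the exact files available depend on the cpufreq driver in use):

        /* Print cpu0's current operating frequency, as reported by the
           kernel's cpufreq sysfs interface (the value is in kHz). */
        #include <stdio.h>

        int main(void)
        {
            const char *path =
                "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq";
            FILE *f = fopen(path, "r");
            unsigned long khz;

            if (!f) {
                perror("fopen");
                return 1;
            }
            if (fscanf(f, "%lu", &khz) == 1)
                printf("cpu0 current frequency: %lu MHz\n", khz / 1000);
            fclose(f);
            return 0;
        }

    Running it repeatedly under varying load makes the scaling visible: the reported frequency drops when the system idles and climbs when work arrives.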

    The second technique, known as deep sleep, is where the processor is halted, that is, the CPU stops executing instructions. The amount of power saved depends on how deep of a sleep the processor can enter, because in deeper sleep states various resources, such as the caches, can be powered off for additional savings. For instance, as the author sits here typing this chapter, the processor can enter a shallow sleep state in between keystrokes, waking up to process each keystroke interrupt, and can enter a deeper sleep state while the author is proofreading what he just typed.

    In general, while frequency scaling can provide some power-savings, the largest savings come from the deep sleep states. Because of this, writing power-efficient software revolves around keeping the CPU asleep, and therefore not executing instructions, for as long as possible. Therefore if work needs to be done, it needs to be done as quickly as possible, so the CPU can return to its slumber. This concept is known as race to idle. Harking back to the earlier definition of software optimization where it was asserted that tuning is a data-driven process requiring constant feedback, it is illuminating to note that while race to idle is a good rule of thumb for the majority of cases, there may be cases where it provides suboptimal power-savings. This is why measurement is so important.

    Both of these power-saving techniques, frequency scaling and deep sleep, are handled transparently by the operating system, so user space applications don’t need to explicitly worry about requesting them. Implicitly however, applications do need to be conscious of them, so they can align their behavior in such a way as to allow the operating system to fully utilize them. For instance, each deep sleep state has an entry and exit latency, with the deeper sleep states having higher latencies than the lighter sleep states. The operating system has to deduce how long the processor can afford to sleep before its next scheduled task, and then pick the deepest sleep state that meets that deadline.
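
    These per-state entry and exit latencies are not hidden from user space; the kernel reports them through the cpuidle sysfs interface. The sketch below assumes a Linux system exposing /sys/devices/system/cpu/cpu0/cpuidle/ and prints each state’s name and exit latency (the upper bound of eight states is an arbitrary choice for the sketch):

        /* List cpu0's available sleep states and their exit latencies,
           as reported by the kernel's cpuidle sysfs interface. */
        #include <stdio.h>

        int main(void)
        {
            char name_path[128], lat_path[128], name[64];
            unsigned long lat; /* exit latency, in microseconds */

            for (int i = 0; i < 8; i++) {
                snprintf(name_path, sizeof(name_path),
                         "/sys/devices/system/cpu/cpu0/cpuidle/state%d/name", i);
                snprintf(lat_path, sizeof(lat_path),
                         "/sys/devices/system/cpu/cpu0/cpuidle/state%d/latency", i);
                FILE *nf = fopen(name_path, "r");
                FILE *lf = fopen(lat_path, "r");

                if (!nf || !lf) {        /* no more states exposed */
                    if (nf) fclose(nf);
                    if (lf) fclose(lf);
                    break;
                }
                if (fscanf(nf, "%63s", name) == 1 &&
                    fscanf(lf, "%lu", &lat) == 1)
                    printf("%-10s exit latency: %lu us\n", name, lat);
                fclose(nf);
                fclose(lf);
            }
            return 0;
        }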

    From this it follows that the hardware can only be as power-efficient as its least efficient software component. The harsh reality is that a power-efficient hardware platform and a finely optimized software stack are all for naught if the user downloads a poorly, or maliciously, written application that prevents the hardware from taking advantage of its power-saving features.
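
    As a concrete illustration, consider one of the most common ways applications defeat these power-saving features: busy-waiting. The sketch below contrasts spinning on a non-blocking file descriptor with blocking in POSIX poll(); it assumes a pollable descriptor, such as a socket or pipe, with the spinning variant further assuming the descriptor was opened with O_NONBLOCK:

        #include <errno.h>
        #include <poll.h>
        #include <unistd.h>

        /* Power-hostile: spinning keeps the core awake, so it never
           reaches the deep sleep states described above. */
        ssize_t read_spinning(int fd, char *buf, size_t len)
        {
            ssize_t n;

            while ((n = read(fd, buf, len)) < 0 && errno == EAGAIN)
                ; /* burn cycles until data arrives */
            return n;
        }

        /* Power-friendly: poll() blocks in the kernel, letting the CPU
           sleep until the device raises an interrupt. */
        ssize_t read_blocking(int fd, char *buf, size_t len)
        {
            struct pollfd pfd = { .fd = fd, .events = POLLIN };

            if (poll(&pfd, 1, -1) < 0) /* timeout of -1: wait indefinitely */
                return -1;
            return read(fd, buf, len);
        }

    Both functions return the same data; only the second lets the operating system exploit the sleep states described above.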

    Performance and Usage Patterns

    The ultimate purpose of computing is to increase the productivity of its user, and thus users, consciously or unconsciously, evaluate software features on whether the benefits to their productivity outweigh the associated costs.

    To illustrate this point, consider the task of searching for an email in a mail client. Often the user is presented with two methods to complete this task: either manually searching through a list sorted by some criterion, or utilizing a keyword search capable of querying metadata. So how does a user decide which method to use for a specific query? Avoiding a discussion of human-computer interaction (HCI) and cognitive psychology, the author submits that the user compares two perceptions.

    The first perception is an estimate of the duration required for performing the manual search. For instance, perhaps the emails are sorted by the date received, and the user remembers that the email was received very recently, and thus perceives a short duration, since the email should be near the top of the list. Or instead, perhaps the user was asked to scrounge up a long forgotten email from months ago, and the user has no idea where in the list this email would appear, and thus perceives a long duration. The second perception is an estimate of the duration required for utilizing the search function. This second perception might consist of prior search performance and accuracy, as well as other considerations, such as the user-interface performance and design.

    The user will compare these two perceptions and choose the one with the shorter perceived duration. Therefore, as the perceived duration for the keyword search increases, its perceived value to the user approaches zero. On the other hand, a streamlined and heavily optimized search might completely displace the manual search.

    This concept scales from individual software features all the way up to entire computing devices. For instance, consider a user fleetingly interested in researching a piece of trivia. If the user’s computer is powered off and takes ten minutes to boot, the user will probably not bother utilizing it for such a small benefit. On the other hand, if the user’s computer boots in five seconds, the user will be more likely to use it.

    It is important to note that these decisions are based on the user’s perceptions. For instance, a mail client might have a heavily optimized search algorithm, but the user still might perceive the search as slow if the user-interface toolkit is slow to update with the query results.

    A Word on Premature Optimization

    So now that we’ve discussed why performance is important, let’s address a common question about when performance analysis and optimization is appropriate. If a software engineer spends enough time writing software with others, the engineer will eventually hear, or perhaps remark to a colleague, the widely misused quote from Donald Knuth: “premature optimization is the root of all evil” (Knuth, 1974).

    In order to understand this quote, it is necessary to first establish its context. The original source is a 1974 publication entitled “Structured Programming with go to Statements” from the ACM’s Computing Surveys journal. In one of the code examples, Knuth notes a performance optimization, of which he says, “In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal …” (Knuth, 1974). So clearly Knuth didn’t mean, as some have misconstrued him to, that all optimizations are bad, only premature optimizations. This then raises the question of what constitutes a premature optimization. Knuth continues, “It is often a mistake to make a priori judgments about what parts of a program are really critical …” (Knuth, 1974). In other words, a premature optimization is one made without first analyzing the program to determine which optimizations are impactful.

    These words might ring truer today than they did back in 1974. Modern systems are significantly more complex, with advances in optimizing compilers, stacked caches, hardware prefetching, out-of-order execution, and countless other technological innovations widening the gap between the code we write and how that code executes, and thus performs. As such, we as engineers must heed Knuth’s warning and ensure that our optimizations target the critical code.

    Since an optimization requires analysis, does that mean that all optimization should wait until the software is completely written? Obviously not, because by that time it will be too late to undo any inefficiencies designed into the core architecture.

    In the early stages of architecture and design, performance considerations should revolve around the data structures utilized and the subsequent algorithms that accompany them. This is an area where Big-O analysis is important, and there should be a general understanding of which algorithms will be computationally challenging. For those not familiar with Big-O analysis, it is a method for categorizing and comparing algorithms based on the order of growth of their computational costs over their best, average, and worst cases of input. Early performance measurements can also be obtained from models and prototypes.
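
    To make the notation concrete, consider two ways of testing whether an array contains a given value. This is a minimal sketch with illustrative names: the linear scan grows as O(n) in the worst case, while binary search, which requires a sorted array, grows as O(log n):

        #include <stddef.h>

        /* O(n): examine each element in turn. */
        int contains_linear(const int *a, size_t n, int key)
        {
            for (size_t i = 0; i < n; i++)
                if (a[i] == key)
                    return 1;
            return 0;
        }

        /* O(log n): halve the search space each step; a must be sorted. */
        int contains_binary(const int *a, size_t n, int key)
        {
            size_t lo = 0, hi = n;

            while (lo < hi) {
                size_t mid = lo + (hi - lo) / 2;

                if (a[mid] == key)
                    return 1;
                else if (a[mid] < key)
                    lo = mid + 1;
                else
                    hi = mid;
            }
            return 0;
        }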

    As architecture manifests as code, frequent performance profiling should occur to verify that these models are accurate. Big-O analysis alone is not sufficient. Consider the canonical case of quicksort versus mergesort and heapsort. Although mergesort and heapsort have a worst-case complexity of O(n log n), they are often outperformed in practice by quicksort, whose worst case is O(n²).
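
    Such profiling need not be elaborate to be useful. What follows is a minimal sketch of a wall-clock measurement around a sort, using POSIX clock_gettime() with the CLOCK_MONOTONIC clock; the array size of one million is an arbitrary choice, and on older glibc versions linking with -lrt may be required:

        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>

        static int cmp_int(const void *a, const void *b)
        {
            int x = *(const int *)a, y = *(const int *)b;

            return (x > y) - (x < y);
        }

        int main(void)
        {
            enum { N = 1000000 };
            int *a = malloc(N * sizeof(*a));
            struct timespec t0, t1;

            if (!a)
                return 1;
            for (int i = 0; i < N; i++)
                a[i] = rand();

            clock_gettime(CLOCK_MONOTONIC, &t0);
            qsort(a, N, sizeof(*a), cmp_int);
            clock_gettime(CLOCK_MONOTONIC, &t1);

            printf("sorted %d ints in %.3f ms\n", N,
                   (t1.tv_sec - t0.tv_sec) * 1e3 +
                   (t1.tv_nsec - t0.tv_nsec) / 1e6);
            free(a);
            return 0;
        }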

    The Roadmap

    So hopefully, you now have a basic understanding of why performance analysis and optimization is important, and when it is appropriate. The rest of this book focuses on the how and the where. The following content is divided into four parts. The first part provides the necessary background information to get you started, focusing on Intel® Architecture and the interactions between the hardware and the Linux software stack. The second part begins by covering performance analysis methodologies. Then it provides instructions on utilizing the most popular performance profiling tools, such as Intel® VTune™ Amplifier XE and perf. The third part outlines some of the various common performance problems, identified by the tools in Part 2, and then provides details on how to correct them. Since each performance situation is different, this part is far from comprehensive, but is designed to give the reader a good starting point.

    Throughout this book, AT&T syntax is used for assembly instructions. Hexadecimal numbers are prefixed with 0x, binary numbers are written with a subscript two, such as 0101₂, and all other numbers are in base ten.

    Reference

    Knuth, D.E., 1974. Structured programming with go to statements. ACM Comput. Surv. 6 (4), 261–301. doi:10.1145/356635.356640.

    Part 1

    Background Knowledge

    Chapter 1

    Early Intel® Architecture

    Abstract

    This chapter is the first of three that explore the Intel® Architecture, focusing on the fundamental design. The chapter begins with the Intel® 8086, the first x86 processor. Then the Intel® 8087 coprocessor is covered, along with the IEEE 754 floating point specification and the Intel® 80286. The chapter ends by looking at the Intel® 80386, the first 32-bit Intel® processor and the first processor to run Linux.

    Keywords

    8086; 8087; 80286; 80386; x87 FPU; x86

    Chapter Contents

    1.1 Intel® 8086

    1.1.1 System State

    Address Space

    1.1.2 Registers

    1.1.3 Instructions

    Data Movement

    Integer Arithmetic

    Boolean Logic

    Flow Control

    String

    1.1.4 Machine Code Format

    1.2 Intel® 8087

    1.2.1 IEEE 754 Floating Point

    Formats, Precision, and Environment

    C99 Support

    1.2.2 x87 Floating Point

    1.3 Intel® 80286 and 80287

    1.3.1 Protected and Real Mode

    1.3.2 Protected Mode Segmentation

    1.3.3 Task Control

    1.4 Intel® 80386 and 80387

    1.4.1 32-Bit Mode

    1.4.2 Paging

    References

    The power button is pressed, causing the #RESET pin to be asserted, that is, the voltage is driven to the predefined threshold that represents the active logic level. After a predefined number of clock cycles pass, the #RESET pin is then deasserted, invoking the hardware reset. The hardware reset is responsible for initializing the processor to a known state. This includes setting the registers, including the general purpose, control, and status registers, to their documented initial values.

    This first breath leaves the processor in an execution state significantly different from the one with which most developers are familiar. The processor begins execution in 16-bit Real-Address mode. This journey from 16-bit Real-Address mode to 64-bit, or 32-bit, Protected Virtual-Address mode occurs every time the processor is powered on. In doing so, this process mirrors the historical evolution of the Intel® Architecture, from the 16-bit 8086 of 1978 to the 64-bit Intel® Core™ processor family of the twenty-first century. As the computer is booted, nearly four decades of advancements in processor technology unfold in the blink of an eye.

    The author imagines that some readers are wondering what benefit could possibly come from the retention of all these relics of computing antiquity in modern processor design. Backward compatibility, which by definition involves maintaining older designs, directly benefits the reader. As readers, hopefully, continue through this book, they will find references to many architectural features. Education on and usage of these features is an investment of the reader’s time. By strictly maintaining backward compatibility, Intel ensures that the investments made by the reader today continue to pay dividends for the foreseeable future. Consider that original binaries compiled in the late 1970s will still run unmodified on a modern Intel® processor fabricated in 2014.

    Other detractors argue that Intel Architecture is inelegant. Since elegance is in the eye of the beholder, this claim is hard to refute. The author is not a hardware engineer; however, as a software engineer, the author sees an interesting parallel between hardware and software in this claim. Many software projects are architected to be elegant solutions to a well-defined problem. This elegance erodes over time as new features, which were unimaginable during the design phase, are requested, and as bug fixes and workarounds are introduced. After enduring these unavoidable harsh realities of the software lifecycle, a program’s worth comes not from elegance, but from whether that software still works, is still reliable, and can still be extended to meet future needs. The author submits that x86 still works, is still reliable, and can still be extended.

    Covering every aspect of Intel Architecture would require an entire book, as opposed to a couple of chapters. The goal of the next three chapters is to introduce the fundamentals of the architecture, in order to provide just enough knowledge to enable the reader to understand the later content in this book, as well as begin to reason effectively about power consumption and performance.

    Because Intel Architecture has been in a constant state of evolution for almost four decades, steadily increasing in complexity, the author has divided the introduction to Intel Architecture into three separate chronologically ordered chapters, sampling the general timeline of Intel products. Although the chapters follow this timeline, history is only being used as a convenient tool. Not all processors or architectural details will be covered in the sections that follow. Also, unless explicitly stated, not all technologies are discussed in the section corresponding to the processor that introduced them. Instead, each section will utilize the given processor as a vehicle for simplifying and explaining important concepts of the modern x86 architecture. This has the dual benefit of gradually introducing the concepts of the modern x86 architecture in a natural and logical progression, while also providing the context necessary to understand the reasoning behind the technical decisions made through the years.

    This chapter examines some of the earliest members of the x86 family. The majority of these processors are 16-bit. The chapter ends with the first 32-bit x86 processor, and the first processor to run Linux, the Intel® 80386.

    1.1 Intel® 8086

    The 8086, introduced in 1978 and shown in Figure 1.1, was not the first processor developed by Intel; however, it was the first processor to implement what would become known as the x86 architecture. In fact, the two digits in the name x86 correspond to the fact that the designations of the first x86 processors all ended in 86, such as 8086, 80186, 80286, 80386, and so on. This naming trend continued at Intel until the Pentium® processor family. Despite being introduced in 1978, the 8086 was continually produced throughout the 1980s and 1990s. Famously, NASA used the 8086 in its Space Shuttle program until 2011.

    Figure 1.1 Intel® 8086 (Lanzet, 2009).

    In stark contrast to the thousands of pages in the modern Intel® Software Developer Manual (SDM), the 8086 user manual, including a full reference on the system behavior, programming tools, hardware pinout, instructions and opcodes, and so on, fits within about two hundred pages. This simplified nature makes the 8086 an excellent introduction to the fundamental concepts of the x86 architecture. The rest of this book will build further on the concepts introduced here.

    In the architecture of the 8086, there are three fundamental design principles (Intel Corporation, 1979). The first principle is the usage of specialized components to isolate and distribute functionality. In other words, rather than imposing one large monolithic system architecture, building an 8086 system provided flexibility in the peripherals selected. This allowed system builders to customize the system to best accommodate the needs of the given situation. The second principle is the inherent support for parallelism within the architecture. At the time of the 8086, this parallelism came in the form of combining multiple processors and coprocessors, in order to perform multiple tasks in parallel. The third principle is the hierarchical bus organization, designed to support both complex and simple data flows.

    To illustrate these principles, consider that along with the 8086, there were two other Intel processors available, the 8088 and 8089. The 8088 was identical to the 8086, except that its external data bus was 8 bits wide instead of 16. The 8088 was chosen by IBM® as the heart of the IBM-compatible PC. The 8089 was a separate coprocessor, with its own instruction set and assembler, designed for offloading data transfers, such as Direct Memory Access (DMA) requests. Interestingly, the 8089 was not chosen by IBM® for the IBM-compatible PC, and therefore was never widely adopted.

    One or more of each of these chips could be mixed and matched, along with a selection of peripherals, to provide systems highly tailored for software requirements. For example, consider the system diagram in Figure 1.2. This diagram, taken from the 8086 User Manual, demonstrates a multiprocessor configuration consisting of one primary 8086 processor, along with three 8088 processors, driving three graphics peripherals, and five 8089 processors, with three accelerating I/O to the graphics peripherals and two accelerating access to a database.

    Figure 1.2 Example multiprocessor Intel® 8086 system diagram (Intel Corporation, 1979).

    This modularity can be seen throughout the design of the 8086. For example, the internal architecture of the 8086 is cleanly divided into two categories: the execution unit (EU), and the bus interface unit (BIU). The EU contains all of the resources for maintaining and manipulating the internal processor state, including the CPU registers and arithmetic/logic unit (ALU). The EU, however, is unable to access the system’s memory, which requires usage of the system bus. The BIU is solely responsible for interfacing with the system bus. This separation allows for the two units to operate in parallel, with the BIU fetching data over the bus while the EU is executing instructions.

    To accommodate this asynchronous nature, the two units interface with an instruction queue, therefore allowing the BIU to prefetch instructions before they are required. When the BIU detects that there is an available slot in the queue, it begins to fetch an instruction to fill that slot. When the EU detects that there is a valid entry in the queue, it removes that entry and begins executing it. If the EU requires the evaluation of a memory operand, it passes the effective address to the BIU, which suspends instruction prefetching and retrieves the desired system information.
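
    The decoupling between the two units can be pictured as a small producer/consumer queue. The following toy model is purely illustrative and in no way cycle-accurate: each iteration, the BIU fills a free slot, while the EU, as if executing multi-cycle instructions, drains an entry only every other iteration, so the queue gradually fills and the BIU runs ahead. The depth of six echoes the 8086’s 6-byte prefetch queue:

        #include <stdio.h>

        enum { QUEUE_DEPTH = 6 }; /* the 8086 prefetch queue held 6 bytes */

        int main(void)
        {
            int queued = 0; /* valid entries currently in the queue */
            int fetched = 0, executed = 0;

            for (int cycle = 0; cycle < 10; cycle++) {
                if (queued < QUEUE_DEPTH) { /* BIU: free slot available? */
                    queued++;
                    fetched++;
                }
                if (cycle % 2 == 0 && queued > 0) { /* EU: ready for more? */
                    queued--;
                    executed++;
                }
                printf("cycle %2d: fetched=%d executed=%d queued=%d\n",
                       cycle, fetched, executed, queued);
            }
            return 0;
        }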

    Beginning to understand the x86 architecture starts with the realization that a processor is simply a state machine: its states are described by the contents of memory and the registers, and its inputs, the instruction set, drive the transitions between those states.

    1.1.1 System State

    For the general case, there are three members of the 8086 storage hierarchy: the processor’s registers, the system’s external memory, that is, RAM, and persistent storage, that is, the disk drive. The registers provide fast volatile storage located on the processor die itself. Since this memory is very fast, it is also very expensive, and therefore very scarce. On the other hand, the volatile external memory is significantly cheaper than the registers, and therefore more plentiful, but is also orders of magnitude slower. Similarly, the persistent storage is significantly cheaper than the external memory, and therefore available in even greater quantities, but is also orders of magnitude slower than external memory. To quantify this inverse relationship between storage density and speed, consider that the 8086 might be paired with a 10 MB hard drive, was capable of addressing 1 MB of external RAM, and had seven 16-bit registers.

    Address Space

    The 8086 has two distinct address spaces: the memory address space, capable of addressing 1 MB, and the I/O address space, capable of addressing 64 KB.

    External peripherals can either be mapped into the I/O address space or memory-mapped. Device registers in the I/O address space are manipulated with the IN and OUT instructions, while memory-mapped device registers can be manipulated with general purpose instructions, just like external memory.
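
    On Linux, the I/O address space remains reachable from x86 user space. The following is a minimal sketch rather than a recommended practice: it assumes root privileges (or CAP_SYS_RAWIO) and glibc’s <sys/io.h>, and it targets port 0x80, the traditional POST diagnostic port, which is conventionally harmless to write:

        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/io.h>

        int main(void)
        {
            /* Request permission for port 0x80 before touching it. */
            if (ioperm(0x80, 1, 1) != 0) {
                perror("ioperm");
                return EXIT_FAILURE;
            }

            outb(0x42, 0x80); /* OUT: write one byte into the I/O space */

            /* A memory-mapped register, by contrast, would be reached
               through an ordinary pointer dereference after mmap()ing
               the device region. */
            return EXIT_SUCCESS;
        }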
