Dynamic Learning is an online subscription solution that supports teachers and students with high quality content and unique tools. Dynamic Learning incorporates elements that all work together to give you the ultimate classroom and homework resource. Teaching and Learning titles include interactive resources, lesson planning tools, self-marking tests and assessment.

Teachers can:
● Use the Lesson Builder to plan and deliver lessons
● Share lessons and resources with students and colleagues
● Track students’ progress

Teachers can also combine their own trusted resources alongside those from Cambridge International AS & A Level Computer Science Online Teacher’s Guide which has a whole host of informative and interactive resources including:
● Teaching notes and guidance
● Schemes of work
● Extra activities and exam-style questions
● Answers to questions in the Student’s Book

Cambridge International AS & A Level Computer Science is available as a Whiteboard eTextbook which is an online interactive version of the printed textbook that enables teachers to:
● Display interactive pages to their class
● Add notes and highlight areas
● Add double-page spreads into lesson plans

Additionally the Student eTextbook of Cambridge International AS & A Level Computer Science is a downloadable version of the printed textbook that teachers can assign to students so they can:
● Download and view on any device or browser
● Add, edit and synchronise notes across two devices
● Access their personal copy on the move

To find out more and sign up for free trials visit: www.hoddereducation.com/dynamiclearning
Cambridge International AS & A Level
Computer Science
457591_FM_CI_AS & A_Level_CS_i-xiv.indd 1
4/30/19 7:42 AM
Cambridge International AS & A Level
Computer Science David Watson Helen Williams
Unless otherwise acknowledged, the questions, example answers and comments that appear in this book were written by the authors. In examinations, the way marks are awarded may be different. Questions from the Cambridge International AS & A Level Computer Science papers are reproduced by permission of Cambridge Assessment International Education. Cambridge Assessment International Education bears no responsibility for the example answers to questions taken from its past question papers which are contained in this publication. The publishers would like to thank the following who have given permission to reproduce the following material in this book: Page 181 Extract from IEEE Code of Ethics. Reprinted with permission of IEEE with the copyright notice ©Copyright 2018 IEEE; Pages 181–3 Copyright © 1999 by the Institute for Electrical and Electronics Engineers, Inc. and the Association for Computing Machinery, Inc.; Page 187 eBay software pirates stump up $100,000 – https://www. theregister.co.uk/2006/11/24/ebay_pirates_payup/. Reprinted with permission of Out-Law.com, the news service of international law firm Pinsent Masons; Page 218 Map data © 2018 Google, Imagery © 2018 Landsat/Copernicus. 
Photo credits Figures 1.1 and 1.2 © David Watson; Figure 1.3 ©Sébastien Delaunay/stock.adobe.com; Figure 2.18 ©Forgem/ Shutterstock.com; Figure 3.1 tl ©studio306fotolia/stock.adobe.com; tr ©Chavim/stock.adobe.com; bl ©pozdeevvs/ stock.adobe.com; br © Sergey Yarochkin/stock.adobe.com; Figure 3.4 ©Mau Horng/stock.adobe.com; Figure 3.5 ©science photo/stock.adobe.com; Figure 3.9 ©Maksym Dykha/Shutterstock.com; Figure 3.10 ©Hurst Photo/ Shutterstock.com; Figure 3.11 ©philipus/stock.adobe.com; Figure 3.12 ©belekekin/ Shutterstock.com; Figure 4.4 l ©cybertrone/stock.adobe.com, c ©Tungphoto/Shutterstock.com, r ©Luminis/Shutterstock.com; Figure 5.1 l ©Stuart Brady (Public Domain) via Wikipedia Commons; r ©Jiri Hera/stock.adobe.com; Figure 6.3 ©Andrey Burmakin/stock.adobe.com; Figure 6.4 ©bkilzer/stock.adobe.com; Figure 7.2 l ©Pres Panayotov/ Shutterstock.com; c ©James Balog/Getty Images; r ©caluian/stock.adobe.com; Figure 18.19 ©seewhatmitchsee/ 123rf.com; Figure 18.21 b ©Garmon/stock.adobe.com; t ©Christian Musat/stock.adobe.com; ct ©Ammit/stock. adobe.com; cb ©Martina Berg/stock.adobe.com; Figure 18.24 Harshal/stock.adobe.com; Figures 18.27 and 18.28 all © David Watson. l = left, c = centre, b = bottom, t = top, r = right Every effort has been made to trace and acknowledge ownership of copyright. The publishers will be glad to make suitable arrangements with any copyright holders whom it has not been possible to contact. Computer hardware and software brand names mentioned in this book are protected by their respective trademarks and are acknowledged. Although every effort has been made to ensure that website addresses are correct at time of going to press, Hodder Education cannot be held responsible for the content of any website mentioned in this book. Hachette UK’s policy is to use papers that are natural, renewable and recyclable products and made from wood grown in well-managed forests and other controlled sources. 
The logging and manufacturing processes are expected to conform to the environmental regulations of the country of origin. Orders: please contact Bookpoint Ltd, 130 Park Drive, Milton Park, Abingdon, Oxon OX14 4SE. Telephone: (44) 01235 827827. Fax: (44) 01235 400401. Email [emailprotected] Lines are open from 9 a.m. to 5 p.m., Monday to Saturday, with a 24-hour message answering service. You can also order through our website: www.hoddereducation.com © David Watson and Helen Williams 2019 First published 2019 by Hodder Education, An Hachette UK Company Carmelite House 50 Victoria Embankment London EC4Y 0DZ www.hoddereducation.com Impression number 10 9 8 7 6 5 4 3 2 1 Year
2023 2022 2021 2020 2019
All rights reserved. Apart from any use permitted under UK copyright law, no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or held within any information storage and retrieval system, without permission in writing from the publisher or under licence from the Copyright Licensing Agency Limited. Further details of such licences (for reprographic reproduction) may be obtained from the Copyright Licensing Agency Limited, www.cla.co.uk Cover photo © Terrance Emerson – stock.adobe.com Illustrations by Aptara Inc. and Hodder Education Typeset by Aptara Inc. Printed by Bell & Bain Ltd, Glasgow A catalogue record for this title is available from the British Library. ISBN: 9781510457591
Contents

Introduction

AS LEVEL

1 Information representation and multimedia
  1.1 Data representation
  1.2 Multimedia
  1.3 File compression

2 Communication
  2.1 Networking
  2.2 The internet

3 Hardware
  3.1 Computers and their components
  3.2 Logic gates and logic circuits

4 Processor fundamentals
  4.1 Central processing unit (CPU) architecture
  4.2 Assembly language
  4.3 Bit manipulation

5 System software
  5.1 Operating systems
  5.2 Language translators

6 Security, privacy and data integrity
  6.1 Data security
  6.2 Data integrity

7 Ethics and ownership
  7.1 Legal, moral, ethical and cultural implications
  7.2 Copyright issues
  7.3 Artificial intelligence (AI)

8 Databases
  8.1 Database concepts
  8.2 Database management systems (DBMSs)
  8.3 Data definition language (DDL) and data manipulation language (DML)

9 Algorithm design and problem solving
  9.1 Computational thinking skills
  9.2 Algorithms

10 Data types and structures
  10.1 Data types and records
  10.2 Arrays
  10.3 Files
  10.4 Abstract data types (ADTs)

11 Programming
  11.1 Programming basics
  11.2 Programming constructs
  11.3 Structured programming

12 Software development
  12.1 Program development lifecycle
  12.2 Program design
  12.3 Program testing and maintenance

A LEVEL

13 Data representation
  13.1 User-defined data types
  13.2 File organisation and access
  13.3 Floating-point numbers, representation and manipulation

14 Communication and internet technologies
  14.1 Protocols
  14.2 Circuit switching and packet switching

15 Hardware
  15.1 Processors and parallel processing
  15.2 Boolean algebra and logic circuits

16 System software and virtual machines
  16.1 Purposes of an operating system (OS)
  16.2 Virtual machines (VMs)
  16.3 Translation software

17 Security
  17.1 Encryption
  17.2 Quantum cryptography
  17.3 Protocols
  17.4 Digital signatures and digital certificates

18 Artificial intelligence (AI)
  18.1 Shortest path algorithms
  18.2 Artificial intelligence, machine learning and deep learning

19 Computational thinking and problem solving
  19.1 Algorithms
  19.2 Recursion

20 Further programming
  20.1 Programming paradigms
  20.2 File processing and exception handling

Glossary
Index
Introduction
This textbook provides the knowledge, understanding and practical skills to support those studying Cambridge International AS & A Level Computer Science. This textbook is part of a suite of resources which include a Programming Skills Workbook and an Online Teacher’s Guide. The syllabus content has been covered comprehensively and is presented in two sections: Chapters 1 to 12 cover the AS Level, Chapters 13 to 20 cover the extra content required for the full A Level.

How to use this book
To make your study of Computer Science as rewarding and successful as possible, this textbook, endorsed by Cambridge Assessment International Education, offers the following important features.

Organisation
The content is presented in the same order as in the syllabus, and the chapter titles match those in the syllabus.

Features to help you learn
Each chapter is broken down into several sections, so that the content is accessible. At the start of each chapter, there is a blue box that gives a summary of the syllabus points to be covered in that chapter, to show you what you are going to learn.

In this chapter, you will learn about
★ binary magnitudes, binary prefixes and decimal prefixes
★ binary, denary and hexadecimal number systems
★ how to carry out binary addition and subtraction
★ the use of hexadecimal and binary coded decimal (BCD) number systems
★ the representation of character sets (such as ASCII and Unicode)
★ how data for a bit-mapped image is encoded
★ how to estimate the file size for a bit-map image
★ image resolution and colour depth
★ encoding of vector graphics
★ the representation of sound in a computer
★ the effects of changing sampling rate and resolution on sound quality
★ the need for file compression methods (such as lossy and lossless formats)
★ how to compress common file formats (such as text files, bit-map images, vector graphics, sound files and video files).

The grey-blue What you should already know boxes at the beginning of each chapter or section help you to check you have the right level of knowledge before you begin. You may have already studied Computer Science at IGCSE, O Level or equivalent, or you may not have. These boxes contain questions to find out how much you remember, or to gauge your previous learning. If you are unable to answer the questions, you will need to refresh your memory, or make sure you are familiar with the relevant ideas, before continuing.
WHAT YOU SHOULD ALREADY KNOW
Try these four questions before you read this chapter.
1 What are the column weightings for the binary number system?
2 Carry out these binary additions. Convert your answers to denary.
  a) 0 0 1 1 0 1 0 1 + 0 1 0 0 1 0 0 0
  b) 0 1 0 0 1 1 0 1 + 0 1 1 0 1 1 1 0
  c) 0 1 0 1 1 1 1 1 + 0 0 0 1 1 1 1 0
  d) 0 1 0 0 0 1 1 1 + 0 1 1 0 1 1 1 1
  e) 1 0 0 0 0 0 0 1 + 0 1 1 1 0 1 1 1
  f) 1 0 1 0 1 0 1 0 + 1 0 1 0 1 0 1 0
3 What are the column weightings for the hexadecimal (base 16) number system?
4 Carry out these hexadecimal additions. Convert your answers to denary.
  a) 1 0 7 + 2 5 7
  b) 2 0 8 + A 1 7
  c) A A A + 7 7 7
  d) 1 F F + 7 F 7
  e) 1 4 9 + F 0 F
  f) 1 2 5 1 + 2 5 6 7
  g) 3 4 A B + C 0 0 A
  h) A 0 0 1 + D 7 7 F
  i) 1 0 0 9 + 9 F F 1
  j) 2 7 7 7 + A C F 1
Key terms for each chapter or section are listed, with definitions. When you are reading through the chapter and you come across a term you don’t understand, go back and see if it has been explained here.
Key terms
Logic gates – electronic circuits which rely on ‘on/off’ logic; the most common ones are NOT, AND, OR, NAND, NOR and XOR.
Logic circuit – formed from a combination of logic gates and designed to carry out a particular task; the output from a logic circuit will be 0 or 1.
Truth table – a method of checking the output from a logic circuit; they use all the possible binary input combinations depending on the number of inputs; for example, two inputs have 2² (4) possible binary combinations, three inputs will have 2³ (8) possible binary combinations, and so on.
Boolean algebra – a form of algebra linked to logic circuits and based on TRUE and FALSE.
There are Activities throughout, so that you can apply what you have learned. Some of these take the form of questions, to allow you to test your knowledge; others aim to give you experience of practical work. Some of these will also give you opportunities to work collaboratively with other students.
ACTIVITY 3B
Produce truth tables for each of the following logic circuits. You are advised to split them up into intermediate parts to help eliminate errors.
[Figure: logic circuits a) to e), with inputs labelled A, B and C and output X]
There are also some Extension activities. These go beyond the requirements of the syllabus, but it is good to see if you know the answers. We hope they will be of interest to you.
EXTENSION ACTIVITY 3E
1 Look at this simplified diagram of a keyboard; the letter H has been pressed. Explain:
  a) how pressing the letter H has been recognised by the computer
  b) how the computer manages the very slow process of inputting data from a keyboard.
2 a) Describe how these types of pointing devices work.
     i) Mechanical mouse
     ii) Optical mouse
  b) Connectivity between mouse and computer can be through USB cable or wireless. Explain these two types of connectivity.
[Figure: simplified keyboard diagram showing keys G, H and J, the conductive layers and the insulating layer; letter H has been pressed and now makes contact with the bottom conductive layer, and is interpreted by the computer]
The End of chapter questions are practice exam-style questions; these provide a more formal way to check your progress. Some questions from Cambridge International AS & A Level Computer Science past papers are included.
End of chapter questions
1 a) The following bytes represent binary integers using the two’s complement form. State the equivalent denary values.
     i) 0 1 0 0 1 1 1 1 [1]
     ii) 1 0 0 1 1 0 1 0 [1]
     iii) Write the integer −53 in two’s complement form. [1]
     iv) Write the maximum possible range of numbers using the two’s complement form of an 8-bit binary number. Give your answers in denary. [2]
  b) i) Write the denary integer 798 in binary-coded decimal (BCD) format. [1]
     ii) Write the denary number that is represented by the following BCD number. [2]
         [BCD bit pattern: table not recoverable from the source]
  c) Give one use of binary-coded decimal system. [1]
Assessment
If you are following the AS Level course, you will take two examination papers:
» Paper 1 Theory Fundamentals (1 hour 30 minutes)
» Paper 2 Fundamental Problem-solving and Programming Skills (2 hours)

If you are studying the A Level course, you will take four examination papers, Papers 1 and 2 and also:
» Paper 3 Advanced Theory (1 hour 30 minutes)
» Paper 4 Practical (2 hours 30 minutes)
Note that calculators must not be used in any paper.
Command words
The table below includes command words used in the assessment for this syllabus. The use of the command word will relate to the subject context. Make sure you are familiar with these.

Command word – What it means
Analyse – examine in detail to show meaning, identify elements and the relationship between them
Assess – make an informed judgement
Calculate – work out from given facts, figures or information
Comment – give an informed opinion
Compare – identify/comment on similarities and/or differences
Complete – add information to an incomplete diagram or table
Consider – review and respond to given information
Contrast – identify/comment on differences
Define – give precise meaning
Demonstrate – show how or give an example
Describe – state the points of a topic/give characteristics and main features
Develop – take forward to a more advanced stage or build upon given information
Discuss – write about issue(s) or topic(s) in depth in a structured way
Draw – draw a line to match a term with a description
Evaluate – judge or calculate the quality, importance, amount, or value of something
Examine – investigate closely, in detail
Explain – set out purposes or reasons/make the relationships between things evident/provide why and/or how and support with relevant evidence
Give – produce an answer from a given source or recall/memory
Identify – name/select/recognise
Justify – support a case with evidence/argument
Outline – set out main points
Predict – suggest what may happen based on available information
Sketch – make a simple freehand drawing showing the key features, taking care over proportions
State – express in clear terms
Suggest – apply knowledge and understanding to situations where there are a range of valid responses in order to make proposals
Summarise – select and present the main points, without detail
Write – write an answer in a specific way
From the authors
We hope you enjoy this book. It encourages you to develop your computational thinking while broadening your understanding of computer science. This should prove helpful when you go on to further study, where topics such as artificial intelligence, quantum cryptography and imperative and declarative programming will be studied; all of these are covered in the later chapters of the book. In order to handle such topics confidently, you will need to be a competent programmer who uses computational thinking to solve problems and has a good understanding of computer architecture. All chapters are designed to build on your previous experience in a way that develops essential skills and at the same time expands the techniques you are able to use.
David Watson
Helen Williams

Notes for teachers

Key concepts
These are the essential ideas that help learners to develop a deep understanding of the subject and to make links between the different topics. Although teachers are likely to have these in mind at all times when they are teaching the syllabus, the following icons are included in the textbook at points where the key concepts relate to the text:

Computational thinking
Computational thinking is a set of fundamental skills that help produce a solution to a problem. Skills such as abstraction, decomposition and algorithmic thinking are used to study a problem and design a solution that can be implemented. This may involve using a range of technologies and programming languages.

Programming paradigms
A programming paradigm is a way of thinking about or approaching problems. There are many different programming styles that can be used, which are suited to unique functions, tools and specific situations. An understanding of programming paradigms is essential to ensure they are used appropriately, when designing and building programs.

Communication
Communication is a core requirement of computer systems. It includes the ability to transfer data from one device or component to another and an understanding of the rules and methods that are used in this data transfer. Communication could range from the internal transfer of data within a computer system, to the transfer of a video across the internet.

Computer architecture and hardware
Computer architecture is the design of the internal operation of a computer system. It includes the rules that dictate how components and data are organised, how data are communicated between components, to allow hardware to function. There is a range of architectures – with different components and rules – that are appropriate for different scenarios. All computers comprise a combination of hardware components, ranging from internal components, such as the central processing unit (CPU) and main memory, to peripherals. To produce effective and efficient programs to run on hardware, it is important to understand how the components work independently and together to produce a system that can be used. Hardware needs software to be able to perform a task. Software allows hardware to become functional. This enables the user to communicate with the hardware to perform tasks.

Data representation and structures
Computers use binary and understanding how a binary number can be interpreted in many different ways is important. Programming requires an understanding of how data can be organised for efficient access and/or transfer.

Additional support
The Programming Skills Workbook provides practice for the programming papers and includes exercises designed to give students the necessary experience of working in one of the three prescribed high-level programming languages: Java (Console mode), Visual Basic and Python (Console mode). It is a write-in workbook designed to be used throughout the course. Answers to questions are available in the Online Teacher’s Guide.
1 Information representation and multimedia

In this chapter, you will learn about
★ binary magnitudes, binary prefixes and decimal prefixes
★ binary, denary and hexadecimal number systems
★ how to carry out binary addition and subtraction
★ the use of hexadecimal and binary coded decimal (BCD) number systems
★ the representation of character sets (such as ASCII and Unicode)
★ how data for a bit-mapped image is encoded
★ how to estimate the file size for a bit-map image
★ image resolution and colour depth
★ encoding of vector graphics
★ the representation of sound in a computer
★ the effects of changing sampling rate and resolution on sound quality
★ the need for file compression methods (such as lossy and lossless formats)
★ how to compress common file formats (such as text files, bit-map images, vector graphics, sound files and video files).

WHAT YOU SHOULD ALREADY KNOW
Try these four questions before you read this chapter.
1 What are the column weightings for the binary number system?
2 Carry out these binary additions. Convert your answers to denary.
  a) 0 0 1 1 0 1 0 1 + 0 1 0 0 1 0 0 0
  b) 0 1 0 0 1 1 0 1 + 0 1 1 0 1 1 1 0
  c) 0 1 0 1 1 1 1 1 + 0 0 0 1 1 1 1 0
  d) 0 1 0 0 0 1 1 1 + 0 1 1 0 1 1 1 1
  e) 1 0 0 0 0 0 0 1 + 0 1 1 1 0 1 1 1
  f) 1 0 1 0 1 0 1 0 + 1 0 1 0 1 0 1 0
3 What are the column weightings for the hexadecimal (base 16) number system?
4 Carry out these hexadecimal additions. Convert your answers to denary.
  a) 1 0 7 + 2 5 7
  b) 2 0 8 + A 1 7
  c) A A A + 7 7 7
  d) 1 F F + 7 F 7
  e) 1 4 9 + F 0 F
  f) 1 2 5 1 + 2 5 6 7
  g) 3 4 A B + C 0 0 A
  h) A 0 0 1 + D 7 7 F
  i) 1 0 0 9 + 9 F F 1
  j) 2 7 7 7 + A C F 1
457591_01_CI_AS & A_Level_CS_001-026.indd 1
25/04/19 9:11 AM
1.1 Data representation

Key terms
Binary – base two number system based on the values 0 and 1 only.
Bit – abbreviation for binary digit.
One’s complement – each binary digit in a number is reversed to allow both negative and positive numbers to be represented.
Two’s complement – each binary digit is reversed and 1 is added in right-most position to produce another method of representing positive and negative numbers.
Sign and magnitude – binary number system where left-most bit is used to represent the sign (0 = + and 1 = –); the remaining bits represent the binary value.
Hexadecimal – a number system based on the value 16 (uses the denary digits 0 to 9 and the letters A to F).
Memory dump – contents of a computer memory output to screen or printer.
Binary-coded decimal (BCD) – number system that uses 4 bits to represent each denary digit.
ASCII code – coding system for all the characters on a keyboard and control codes.
Character set – a list of characters that have been defined by computer hardware and software. It is necessary to have a method of coding, so that the computer can understand human characters.
Unicode – coding system which represents all the languages of the world (first 128 characters are the same as ASCII code).
1.1.1 Number systems
Every one of us is used to the decimal or denary (base 10) number system. This uses the digits 0 to 9 which are placed in ‘weighted’ columns.

  10 000   1000   100   10   units
       3      1     4    2       1

The denary number represented above is thirty-one thousand, four hundred and twenty-one. (Note that dealing with decimal fractions is covered in Chapter 13 since this is slightly more complex.)

Designers of computer systems adopted the binary (base 2) number system since this allows only two values, 0 and 1. No matter how complex the system, the basic building block in all computers is the binary number system. Since computers contain millions and millions of tiny ‘switches’, which must be in the ON or OFF position, this lends itself logically to the binary system. A switch in the ON position can be represented by 1; a switch in the OFF position can be represented by 0. Each binary digit is known as a bit.
1.1.2 Binary number system
The binary system uses 1s and 0s only which gives these corresponding weightings:

   128    64    32    16     8     4     2     1
  (2⁷)  (2⁶)  (2⁵)  (2⁴)  (2³)  (2²)  (2¹)  (2⁰)

A typical binary number would be:

     1     1     1     0     1     1     1     0

Converting from binary to denary and from denary to binary
It is fairly straightforward to change a binary number into a denary number. Each time a 1 appears in a column, the column value is added to the total. For example, the binary number above is: 128 + 64 + 32 + 8 + 4 + 2 = 238 (denary)
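The column-weighting rule described above can be sketched in Python. This is only an illustration: the name binary_to_denary and the fixed 8-bit width are assumptions made for this example, not something defined in the syllabus.

```python
# Column weightings for an 8-bit (unsigned) binary number.
WEIGHTS = [128, 64, 32, 16, 8, 4, 2, 1]

def binary_to_denary(bits):
    """Hypothetical helper: add the column value wherever a 1 appears."""
    digits = [int(ch) for ch in bits if ch in "01"]  # spaces are ignored
    if len(digits) != len(WEIGHTS):
        raise ValueError("expected exactly 8 binary digits")
    return sum(weight for weight, digit in zip(WEIGHTS, digits) if digit == 1)

print(binary_to_denary("1 1 1 0 1 1 1 0"))  # 238
```

Columns holding a 0 contribute nothing to the total, which is why only the weightings under a 1 are summed.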
ACTIVITY 1A
Convert these binary numbers into denary.
a) 0 0 1 1 0 0 1 1
b) 0 1 1 1 1 1 1 1
c) 1 0 0 1 1 0 0 1
d) 0 1 1 1 0 1 0 0
e) 1 1 1 1 1 1 1 1
f) 0 0 0 0 1 1 1 1
g) 1 0 0 0 1 1 1 1
h) 0 0 1 1 0 0 1 1
i) 0 1 1 1 0 0 0 0
j) 1 1 1 0 1 1 1 0

The reverse operation – converting from denary to binary – is slightly more complex. There are two basic ways of doing this. Consider the conversion of the denary number, 107, into binary …

Method 1
This method involves placing 1s in the appropriate position so that the total equates to 107.

  128   64   32   16    8    4    2    1
    0    1    1    0    1    0    1    1

The 0 values are simply ignored when calculating the total.

Method 2
This method involves successive division by 2; the remainders are then written from bottom to top to give the binary value.

  107 ÷ 2 = 53   remainder: 1
   53 ÷ 2 = 26   remainder: 1
   26 ÷ 2 = 13   remainder: 0
   13 ÷ 2 =  6   remainder: 1
    6 ÷ 2 =  3   remainder: 0
    3 ÷ 2 =  1   remainder: 1
    1 ÷ 2 =  0   remainder: 1
    0 ÷ 2 =  0   remainder: 0

Write the remainder from bottom to top to get the binary number: 0 1 1 0 1 0 1 1

ACTIVITY 1B
Convert these denary numbers into binary (using either method).
a) 4 1
b) 6 7
c) 8 6
d) 1 0 0
e) 1 1 1
f) 1 2 7
g) 1 4 4
h) 1 8 9
i) 2 0 0
j) 2 5 5
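Method 2 (successive division by 2) can be sketched in Python as follows. The function name denary_to_binary and the default width of 8 bits are assumptions for illustration only.

```python
def denary_to_binary(value, width=8):
    """Hypothetical helper: successive division by 2, as in Method 2."""
    if not 0 <= value < 2 ** width:
        raise ValueError("value does not fit in the chosen number of bits")
    remainders = []
    for _ in range(width):
        value, remainder = divmod(value, 2)  # quotient carries on, remainder is kept
        remainders.append(remainder)
    # The remainders are read from bottom to top, so the list is reversed.
    return " ".join(str(bit) for bit in reversed(remainders))

print(denary_to_binary(107))  # 0 1 1 0 1 0 1 1
```

Dividing a fixed number of times (rather than stopping when the quotient reaches 0) pads the answer with leading 0s, which matches the final "remainder: 0" row in the worked example.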
Binary addition and subtraction
Up until now we have assumed all binary numbers have positive values. There are a number of methods to represent both positive and negative numbers. We will consider:
» one’s complement
» two’s complement.

In one’s complement, each digit in the binary number is inverted (in other words, 0 becomes 1 and 1 becomes 0). For example, 0 1 0 1 1 0 1 0 (denary value 90) becomes 1 0 1 0 0 1 0 1 (denary value −90).

In two’s complement, each digit in the binary number is inverted and a ‘1’ is added to the right-most bit. For example, 0 1 0 1 1 0 1 0 (denary value 90) becomes:
1 0 1 0 0 1 0 1 + 1 = 1 0 1 0 0 1 1 0 (since 1 + 1 = 0, a carry of 1) = denary value −90

Throughout the remainder of this chapter, we will use the two’s complement method to avoid confusion. Also, two’s complement makes binary addition and subtraction more straightforward. The reader is left to investigate one’s complement and the sign and magnitude method in binary arithmetic.

Now that we are introducing negative numbers, we need a way to represent these in binary. The two’s complement uses these weightings for an 8-bit number representation:

  −128   64   32   16    8    4    2    1

This means:

  −128   64   32   16    8    4    2    1
     1    1    0    1    1    0    1    0
     0    0    1    0    0    1    1    0

The first example is: −128 + 64 + 16 + 8 + 2 = −38
The second example is: 32 + 4 + 2 = 38

The easiest way to convert a number into its negative equivalent is to use two’s complement. For example, 104 in binary is 0 1 1 0 1 0 0 0. To find the binary value for −104 using two’s complement:

                       0 1 1 0 1 0 0 0   (+104 in denary)
  invert the digits:   1 0 0 1 0 1 1 1
  add 1:                             1
  which gives:         1 0 0 1 1 0 0 0   = −104

EXTENSION ACTIVITY 1A
Show the column headings for a system that uses 16 bits to represent a binary number.
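As a rough illustration of the two ideas above, the sketch below evaluates an 8-bit pattern using the two’s complement column weightings, and negates a pattern by inverting every digit and adding 1. The function names are hypothetical and the 8-bit width is an assumption.

```python
# Two's complement column weightings for an 8-bit number.
WEIGHTS = [-128, 64, 32, 16, 8, 4, 2, 1]

def twos_complement_value(bits):
    """Hypothetical helper: sum the weightings of the columns holding a 1."""
    digits = [int(ch) for ch in bits if ch in "01"]
    return sum(weight * digit for weight, digit in zip(WEIGHTS, digits))

def negate(bits):
    """Hypothetical helper: invert the digits, then add 1 to the right-most bit."""
    digits = [1 - int(ch) for ch in bits if ch in "01"]  # invert each digit
    carry = 1                                            # the '1' being added
    for i in range(len(digits) - 1, -1, -1):             # work from right to left
        total = digits[i] + carry
        digits[i] = total % 2
        carry = total // 2
    return " ".join(str(d) for d in digits)

print(twos_complement_value("1 1 0 1 1 0 1 0"))  # -38
print(negate("0 1 1 0 1 0 0 0"))                 # 1 0 0 1 1 0 0 0
print(twos_complement_value(negate("0 1 1 0 1 0 0 0")))  # -104
```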
ACTIVITY 1C
Convert these denary numbers into 8-bit binary numbers using two’s complement where necessary. Use these binary column weightings:

  −128   64   32   16    8    4    2    1

a) +114
b) +61
c) +96
d) −14
e) −116
Binary addition
Consider Examples 1.1 and 1.2.

Example 1.1
Add 0 0 1 0 0 1 0 1 (37 in denary) and 0 0 1 1 1 0 1 0 (58 in denary).

Solution

     −128   64   32   16    8    4    2    1
        0    0    1    0    0    1    0    1
  +     0    0    1    1    1    0    1    0
  =     0    1    0    1    1    1    1    1

This gives us 0 1 0 1 1 1 1 1, which is 95 in denary; the correct answer.
Example 1.2
Add 0 1 0 1 0 0 1 0 (82 in denary) and 0 1 0 0 0 1 0 1 (69 in denary).

Solution

     −128   64   32   16    8    4    2    1
        0    1    0    1    0    0    1    0
  +     0    1    0    0    0    1    0    1
  =     1    0    0    1    0    1    1    1

This gives us 1 0 0 1 0 1 1 1, which is –105 in denary (which is clearly nonsense). When adding two positive numbers, the result should always be positive (likewise, when adding two negative numbers, the result should always be negative). Here, the addition of two positive numbers has resulted in a negative answer. This is due to the result of the addition producing a number which is outside the range of values which can be represented by the 8 bits being used (in this case +127 is the largest value which can be represented, and the calculation produces the value 151, which is larger than 127 and, therefore, out of range). This causes overflow; it is considered in more detail in Chapter 13.
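Examples 1.1 and 1.2 can be reproduced with a short Python sketch. The function add_8bit is hypothetical, and it assumes both inputs are denary integers in the range −128 to +127; overflow is flagged when two numbers of the same sign give a result of the opposite sign.

```python
def add_8bit(a, b):
    """Hypothetical helper: add two 8-bit two's complement numbers.

    Returns (bit pattern, denary value of the pattern, overflow flag).
    Assumes a and b are each in the range -128 to +127.
    """
    raw = (a & 0xFF) + (b & 0xFF)      # add the two 8-bit patterns
    result = raw & 0xFF                # keep only 8 bits
    value = result - 256 if result >= 128 else result  # left-most bit weighs -128
    same_sign = (a >= 0) == (b >= 0)
    overflow = same_sign and ((value >= 0) != (a >= 0))
    return format(result, "08b"), value, overflow

print(add_8bit(37, 58))  # ('01011111', 95, False)
print(add_8bit(82, 69))  # ('10010111', -105, True)
```

The second call shows the overflow case from Example 1.2: the true sum, 151, cannot be held in 8 bits when the left-most column is weighted −128.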
Binary subtraction
To carry out subtraction in binary, we convert the number being subtracted into its negative equivalent using two’s complement, and then add the two numbers.

Example 1.3
Carry out the subtraction 95 – 68 in binary.

Solution
1 Convert the two numbers into binary:
  95 = 0 1 0 1 1 1 1 1
  68 = 0 1 0 0 0 1 0 0
2 Find the two’s complement of 68:
  invert the digits:   1 0 1 1 1 0 1 1
  add 1:                             1
  which gives:         1 0 1 1 1 1 0 0   = −68
3 Add 95 and −68:

          −128   64   32   16    8    4    2    1
             0    1    0    1    1    1    1    1
  +          1    0    1    1    1    1    0    0
  =     1    0    0    0    1    1    0    1    1

The additional ninth bit is simply ignored leaving the binary number 0 0 0 1 1 0 1 1 (denary equivalent of 27, which is the correct result of the subtraction).
Example 1.4
Carry out the subtraction 49 – 80 in binary.
Solution
1 Convert the two numbers into binary:
49 = 0 0 1 1 0 0 0 1
80 = 0 1 0 1 0 0 0 0
2 Find the two’s complement of 80:
invert the digits:   1 0 1 0 1 1 1 1
add 1:               0 0 0 0 0 0 0 1
which gives:         1 0 1 1 0 0 0 0 = −80
3 Add 49 and −80:
     −128   64   32   16    8    4    2    1
        0    0    1    1    0    0    0    1
  +     1    0    1    1    0    0    0    0
  =     1    1    1    0    0    0    0    1
This gives us 1 1 1 0 0 0 0 1, which is −31 in denary; the correct answer.
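As a rough check on Examples 1.3 and 1.4, the same method (invert the digits, add 1, then add and ignore any ninth bit) can be written in Python. The helper names below are illustrative only.

```python
def twos_complement8(value: int) -> int:
    """Return the 8-bit two's complement of a bit pattern: invert, then add 1."""
    inverted = value ^ 0xFF          # flip all eight bits
    return (inverted + 1) & 0xFF


def subtract8(a: int, b: int) -> str:
    """Work out a - b by adding the two's complement of b."""
    total = a + twos_complement8(b)
    return format(total & 0xFF, "08b")  # any ninth bit is discarded


print(subtract8(95, 68))  # 00011011 (27 in denary)
print(subtract8(49, 80))  # 11100001 (-31 in two's complement)
```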
ACTIVITY 1D
Carry out these binary additions and subtractions using these 8-bit column weightings:
−128   64   32   16    8    4    2    1
a) 0 0 1 1 1 0 0 1 + 0 0 1 0 1 0 0 1
b) 0 1 0 0 1 0 1 1 + 0 0 1 0 0 0 1 1
c) 0 1 0 1 1 0 0 0 + 0 0 1 0 1 0 0 0
d) 0 1 1 1 0 0 1 1 + 0 0 1 1 1 1 1 0
e) 0 0 0 0 1 1 1 1 + 0 0 0 1 1 1 0 0
f) 0 1 1 0 0 0 1 1 − 0 0 1 1 0 0 0 0
g) 0 1 1 1 1 1 1 1 − 0 1 0 1 1 0 1 0
h) 0 0 1 1 0 1 0 0 − 0 1 0 0 0 1 0 0
i) 0 0 0 0 0 0 1 1 − 0 1 1 0 0 1 0 0
j) 1 1 0 1 1 1 1 1 − 1 1 0 0 0 0 1 1
Measurement of the size of computer memories
The byte (8 bits) is the smallest unit of memory in a computer. Some computers, such as 16-bit systems and 32-bit systems, work with larger units, but these are always multiples of 8 bits. One byte of memory wouldn’t allow you to store very much information, so memory size is measured in the multiples shown in Table 1.1.
Name of memory size      Equivalent denary value (bytes)
1 kilobyte (1 KB)        1 000
1 megabyte (1 MB)        1 000 000
1 gigabyte (1 GB)        1 000 000 000
1 terabyte (1 TB)        1 000 000 000 000
1 petabyte (1 PB)        1 000 000 000 000 000
▲ Table 1.1 Memory size and denary values
The system of numbering shown in Table 1.1 only refers to some storage devices, but is technically inaccurate. It is based on the SI (base 10) system of units where 1 kilo is equal to 1000. A 1 TB hard disk drive would allow the storage of 1 × 10¹² bytes according to this system. However, since memory size is actually measured in terms of powers of 2, another system has been proposed by the International Electrotechnical Commission (IEC); it is based on the binary system. See Table 1.2.
Name of memory size      Number of bytes      Equivalent denary value (bytes)
1 kibibyte (1 KiB)       2¹⁰                  1 024
1 mebibyte (1 MiB)       2²⁰                  1 048 576
1 gibibyte (1 GiB)       2³⁰                  1 073 741 824
1 tebibyte (1 TiB)       2⁴⁰                  1 099 511 627 776
1 pebibyte (1 PiB)       2⁵⁰                  1 125 899 906 842 624
▲ Table 1.2 IEC memory size system
This system is more accurate. Internal memories (such as RAM) should be measured using the IEC system. A 64 GiB RAM could, therefore, store 64 × 2³⁰ bytes of data (68 719 476 736 bytes). See Section 1.2 for examples of how to calculate the size of a file.
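The difference between the two systems is easy to see in code. The sketch below assumes the unit names from Tables 1.1 and 1.2; the function `to_bytes` is a hypothetical helper written for this illustration.

```python
SI_UNITS = ["KB", "MB", "GB", "TB", "PB"]        # powers of 1000 (Table 1.1)
IEC_UNITS = ["KiB", "MiB", "GiB", "TiB", "PiB"]  # powers of 1024 (Table 1.2)


def to_bytes(amount: int, unit: str) -> int:
    """Convert an amount in an SI or IEC unit into a number of bytes."""
    if unit in SI_UNITS:
        return amount * 1000 ** (SI_UNITS.index(unit) + 1)
    if unit in IEC_UNITS:
        return amount * 1024 ** (IEC_UNITS.index(unit) + 1)
    raise ValueError(f"unknown unit: {unit}")


print(to_bytes(64, "GiB"))  # 68719476736
print(to_bytes(1, "TB"))    # 1000000000000
```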
1.1.3 Hexadecimal number system
The hexadecimal system is very closely related to the binary system. Hexadecimal (sometimes referred to as simply hex) is a base 16 system with the weightings:
1 048 576 (16⁵)   65 536 (16⁴)   4096 (16³)   256 (16²)   16 (16¹)   1 (16⁰)
Because it is a system based on 16 different digits, the numbers 0 to 9 and the letters A to F are used to represent hexadecimal digits. A = 10, B = 11, C = 12, D = 13, E = 14 and F = 15. Since 16 = 2⁴, four binary digits are equivalent to each hexadecimal digit. Table 1.3 summarises the link between binary, hexadecimal and denary.
Binary value    Hexadecimal value    Denary value
0000            0                    0
0001            1                    1
0010            2                    2
0011            3                    3
0100            4                    4
0101            5                    5
0110            6                    6
0111            7                    7
1000            8                    8
1001            9                    9
1010            A                    10
1011            B                    11
1100            C                    12
1101            D                    13
1110            E                    14
1111            F                    15
▲ Table 1.3 The link between binary, hexadecimal and denary
Converting from binary to hexadecimal and from hexadecimal to binary
Converting from binary to hexadecimal is a fairly easy process. Starting from the right and moving left, split the binary number into groups of 4 bits. If the last (leftmost) group has fewer than 4 bits, then simply fill in with 0s from the left. Take each group of 4 bits and convert it into the equivalent hexadecimal digit using Table 1.3. Examples 1.5 and 1.6 show you how this works.
Example 1.5
Convert 1 0 1 1 1 1 1 0 0 0 0 1 from binary to hexadecimal.
Solution
First split it into groups of 4 bits:
1011 1110 0001
Then find the equivalent hexadecimal digits:
B E 1
Example 1.6
Convert 1 0 0 0 0 1 1 1 1 1 1 1 0 1 from binary to hexadecimal.
Solution
First split it into groups of 4 bits:
10 0001 1111 1101
The left group only contains 2 bits, so add in two 0s to the left:
0010 0001 1111 1101
Now find the equivalent hexadecimal digits:
2 1 F D
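The grouping method used in Examples 1.5 and 1.6 can be sketched as a short Python function. The name `binary_to_hex` is chosen here purely for illustration.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a string of binary digits to hexadecimal, 4 bits at a time."""
    bits = bits.replace(" ", "")
    padding = (-len(bits)) % 4            # 0s needed to complete the left group
    bits = "0" * padding + bits
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("1 0 1 1 1 1 1 0 0 0 0 1"))      # BE1
print(binary_to_hex("1 0 0 0 0 1 1 1 1 1 1 1 0 1"))  # 21FD
```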
ACTIVITY 1E
Convert these binary numbers into hexadecimal.
a) 1 1 0 0 0 0 1 1
b) 1 1 1 1 0 1 1 1
c) 1 0 0 1 1 1 1 1 1 1
d) 1 0 0 1 1 1 0 1 1 1 0
e) 0 0 0 1 1 1 1 0 0 0 0 1
f) 1 0 0 0 1 0 0 1 1 1 1 0
g) 0 0 1 0 0 1 1 1 1 1 1 1 0
h) 0 1 1 1 0 1 0 0 1 1 1 0 0
i) 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1
j) 0 0 1 1 0 0 1 1 1 1 0 1 0 1 1 1 0
Converting from hexadecimal to binary is also straightforward. Using the data from Table 1.3, simply take each hexadecimal digit and write down the 4-bit code which corresponds to the digit.
Example 1.7
Convert this hexadecimal number to its binary equivalent. 4 5 A
Solution
Using Table 1.3, find the 4-bit code for each digit:
0100 0101 1010
Put the groups together to form the binary number:
010001011010
Example 1.8
Convert this hexadecimal number to its binary equivalent. B F 0 8
Solution
Using Table 1.3:
1011 1111 0000 1000
Then put all the digits together:
1011111100001000
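The reverse conversion in Examples 1.7 and 1.8 is a one-line lookup per digit. A minimal sketch, with `hex_to_binary` as an illustrative name:

```python
def hex_to_binary(hex_digits: str) -> str:
    """Replace each hexadecimal digit with its 4-bit binary code."""
    digits = hex_digits.replace(" ", "")
    return "".join(format(int(digit, 16), "04b") for digit in digits)


print(hex_to_binary("4 5 A"))    # 010001011010
print(hex_to_binary("B F 0 8"))  # 1011111100001000
```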
Use of the hexadecimal system
This section reviews two uses of the hexadecimal system.
Memory dumps
It is much easier to work with:
B5A41AFC
than it is to work with:
10110101101001000001101011111100
So, hexadecimal is often used when developing new software or when trying to trace errors in programs. When the memory contents are output to a printer or monitor, this is known as a memory dump.
ACTIVITY 1F
Convert these hexadecimal numbers into binary.
a) 6 C
b) 5 9
c) A A
d) A 0 0
e) 4 0 E
f) B A 6
g) 9 C C
h) 4 0 A A
i) D A 4 7
j) 1 A B 0
00990F60   54 68 69 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F 66
00990F77   61 20 6D 65 6D 6F 72 79 20 64 75 6D 70 20 66 72 6F 6D 20 20 61
00990E8E   74 79 70 69 63 61 6C 20 20 63 6F 6D 70 75 74 65 72 20 20 6D 85
00990EA5   6D 6F 72 79 20 73 68 6F 77 69 6E 67 20 74 68 65 20 20 63 6F 6E
00990EBC   74 65 6E 74 73 20 6F 66 20 61 20 6E 75 6D 62 65 72 20 20 6F 66
00990ED3   6C 6F 63 61 74 69 6F 6E 73 20 20 69 6E 20 20 68 65 78 20 20 20
00990EEA   6E 6F 74 61 74 69 6F 6E 20 20 00 00 00 00 00 00 00 00 00 00 00
▲ Table 1.4 Memory dump
A program developer can look at each of the hexadecimal codes (as shown in Table 1.4) and determine where the error lies. The value on the far left shows the memory location, so it is possible to find out exactly where in memory the fault occurs. Using hexadecimal is more manageable than binary. It is a powerful fault-tracing tool, but requires considerable knowledge of computer architecture to be able to interpret the results.
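To give a feel for how a developer might read such a dump, the sketch below decodes the 21 bytes stored at the first memory location in Table 1.4 as ASCII characters. The function name `decode_row` is an assumption made for this example, and bytes outside the printable range are shown as a dot, as many dump tools do.

```python
def decode_row(hex_bytes: str) -> str:
    """Turn a row of hexadecimal byte values into readable ASCII text."""
    data = bytes.fromhex(hex_bytes)  # spaces between byte pairs are ignored
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


first_row = "54 68 69 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F 66"
print(decode_row(first_row))  # This is an example of
```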
1.1.4 Binary-coded decimal (BCD) system
The binary-coded decimal (BCD) system uses a 4-bit code to represent each denary digit:
0 0 0 0 = 0      0 1 0 1 = 5
0 0 0 1 = 1      0 1 1 0 = 6
0 0 1 0 = 2      0 1 1 1 = 7
0 0 1 1 = 3      1 0 0 0 = 8
0 1 0 0 = 4      1 0 0 1 = 9
Therefore, the denary number 3 1 6 5 would be 0 0 1 1 0 0 0 1 0 1 1 0 0 1 0 1 in BCD format. The 4-bit code can be stored in the computer either as half a byte or two 4-bit codes stored together to form one byte. For example, using 3 1 6 5 again …
Method 1: four single bytes
0 0 0 0 0 0 1 1   (3)
0 0 0 0 0 0 0 1   (1)
0 0 0 0 0 1 1 0   (6)
0 0 0 0 0 1 0 1   (5)
Method 2: two bytes
0 0 1 1 0 0 0 1   (3 1)
0 1 1 0 0 1 0 1   (6 5)
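The two storage methods can be compared with a short Python sketch. The function names are invented for this illustration, and Method 1 is assumed to pad each 4-bit code with four leading 0s.

```python
def bcd_codes(number: str) -> list[str]:
    """Return the 4-bit BCD code for each denary digit."""
    return [format(int(digit), "04b") for digit in number]


def one_digit_per_byte(number: str) -> list[str]:
    """Method 1: each 4-bit code is stored in its own byte."""
    return ["0000" + code for code in bcd_codes(number)]


def two_digits_per_byte(number: str) -> list[str]:
    """Method 2: two 4-bit codes are stored together in one byte."""
    codes = bcd_codes(number)
    if len(codes) % 2:                 # odd number of digits: pad on the left
        codes.insert(0, "0000")
    return [codes[i] + codes[i + 1] for i in range(0, len(codes), 2)]


print(one_digit_per_byte("3165"))   # ['00000011', '00000001', '00000110', '00000101']
print(two_digits_per_byte("3165"))  # ['00110001', '01100101']
```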
ACTIVITY 1G
1 Convert these denary numbers into BCD format.
a) 2 7 1
b) 5 0 0 6
c) 7 9 9 0
2 Convert these BCD numbers into denary numbers.
a) 1 0 0 1 0 0 1 1 0 1 1 1
b) 0 1 1 1 0 1 1 1 0 1 1 0 0 0 1 0
Uses of BCD
The most obvious use of BCD is in the representation of digits on a calculator or clock display. Each denary digit will have a BCD equivalent value which makes it easy to convert from computer output to denary display.
As you will learn in Chapter 13, many decimal values cannot be represented exactly in computer memories which use the binary number system. Normally this doesn’t cause a major issue since the differences can be dealt with. However, when it comes to accounting and representing monetary values in computers, exact values need to be stored to prevent significant errors from accumulating. Monetary values use a fixed-point notation, for example $1.31, so one solution is to represent each denary digit as a BCD value.
Consider adding $0.37 and $0.94 together using fixed-point decimals.
    $0.37       0 0 0 0 0 0 0 0 . 0 0 1 1 0 1 1 1
  + $0.94     + 0 0 0 0 0 0 0 0 . 1 0 0 1 0 1 0 0
Expected result = $1.31
Using binary addition, this sum will produce: 0 0 0 0 0 0 0 0 . 1 1 0 0 1 0 1 1 which produces 1 1 0 0 (denary 12) and 1 0 1 1 (denary 11), which is clearly incorrect. The problem was caused by 3 + 9 = 12 and 7 + 4 = 11, as neither 12 nor 11 are single denary digits. The solution to this problem, enabling the computer to store monetary values accurately, is to add 0 1 1 0 (denary 6) whenever such a problem arises. The computer can be programmed to recognise this issue and add 0 1 1 0 at each appropriate point.
If we look at the example again, we can add .07 and .04 (the two digits in the second decimal place) first.
        0 1 1 1
  +     0 1 0 0
  =     1 0 1 1
This produces 1 0 1 1 which isn’t a denary digit; this will flag an error and the computer needs to add 0 1 1 0.
        1 0 1 1
  +     0 1 1 0
  =   1 0 0 0 1
This now produces a fifth bit which is carried to the next decimal digit position.
Now we will add .3 and .9 together (the two digits in the first decimal place) remembering the carry bit from the addition above:
        0 0 1 1
  +     1 0 0 1
  +           1
  =     1 1 0 1
This produces 1 1 0 1 which isn’t a denary digit; this will flag an error and the computer again needs to add 0 1 1 0.
        1 1 0 1
  +     0 1 1 0
  =   1 0 0 1 1
This again produces a fifth bit which is carried to the next decimal digit position.
Adding 1 to 0 0 0 0 0 0 0 0 produces: 0 0 0 0 0 0 0 1
Final answer: 0 0 0 0 0 0 0 1 . 0 0 1 1 0 0 0 1
which is 1.31 in denary – the correct answer.
ACTIVITY 1H
Carry out these BCD additions.
a) 0.45 + 0.21
b) 0.66 + 0.51
c) 0.88 + 0.75
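The correction step (add 0 1 1 0 whenever a digit sum is not a valid denary digit, then carry the fifth bit) can be sketched as follows. This is a simplified model that works on strings of denary digits of equal length with the decimal point removed; `bcd_add` is a hypothetical name.

```python
def bcd_add(a: str, b: str) -> str:
    """Add two equal-length strings of denary digits one BCD digit at a time."""
    carry = 0
    digits = []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry  # plain 4-bit binary addition
        if total > 9:                      # not a valid BCD digit
            total += 0b0110                # correct by adding 0110 (denary 6)
        carry = total >> 4                 # the fifth bit, if any
        digits.append(total & 0b1111)      # keep the lowest four bits
    digits.append(carry)
    return "".join(str(d) for d in reversed(digits))


# $0.37 + $0.94, written without the decimal point
print(bcd_add("037", "094"))  # 0131, in other words 1.31
```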
1.1.5 ASCII codes and Unicodes
The ASCII code system (American Standard Code for Information Interchange) was set up in 1963 for use in communication systems and computer systems. The newer version of the code was published in 1986. The standard ASCII code character set consists of 7-bit codes (0 to 127 denary or 00 to 7F in hexadecimal); this represents the letters, numbers and characters found on a standard keyboard together with 32 control codes (which use up codes 0 to 31 (denary) or 00 to 1F (hexadecimal)). Table 1.5 shows part of the standard ASCII code table (only the control codes have been removed from the table).
Dec   Hex   Char         Dec   Hex   Char      Dec   Hex   Char
32    20    (space)      64    40    @         96    60    `
33    21    !            65    41    A         97    61    a
34    22    "            66    42    B         98    62    b
35    23    #            67    43    C         99    63    c
36    24    $            68    44    D         100   64    d
37    25    %            69    45    E         101   65    e
38    26    &            70    46    F         102   66    f
39    27    '            71    47    G         103   67    g
40    28    (            72    48    H         104   68    h
41    29    )            73    49    I         105   69    i
42    2A    *            74    4A    J         106   6A    j
43    2B    +            75    4B    K         107   6B    k
44    2C    ,            76    4C    L         108   6C    l
45    2D    -            77    4D    M         109   6D    m
46    2E    .            78    4E    N         110   6E    n
47    2F    /            79    4F    O         111   6F    o
48    30    0            80    50    P         112   70    p
49    31    1            81    51    Q         113   71    q
50    32    2            82    52    R         114   72    r
51    33    3            83    53    S         115   73    s
52    34    4            84    54    T         116   74    t
53    35    5            85    55    U         117   75    u
54    36    6            86    56    V         118   76    v
55    37    7            87    57    W         119   77    w
56    38    8            88    58    X         120   78    x
57    39    9            89    59    Y         121   79    y
58    3A    :            90    5A    Z         122   7A    z
59    3B    ;            91    5B    [         123   7B    {
60    3C    <            92    5C    \         124   7C    |
61    3D    =            93    5D    ]         125   7D    }
62    3E    >            94    5E    ^         126   7E    ~
63    3F    ?            95    5F    _         127   7F    (delete)
▲ Table 1.5 Part of the ASCII code table
Notice the storage of characters with uppercase and lowercase. For example:
a    1 1 0 0 0 0 1    hex 61 (lowercase)
A    1 0 0 0 0 0 1    hex 41 (uppercase)
y    1 1 1 1 0 0 1    hex 79 (lowercase)
Y    1 0 1 1 0 0 1    hex 59 (uppercase)
Notice the sixth bit (counting from the right, the bit with weighting 32) changes from 1 to 0 when comparing lowercase and uppercase characters. This makes the conversion between the two an easy operation. It is also noticeable that the character sets (such as a to z, 0 to 9, and so on) are grouped together in sequence, which makes them easier to work with.
Extended ASCII uses 8-bit codes; the additional codes are 128 to 255 in denary (80 to FF in hex). This allows for non-English characters and for drawing characters to be included.
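Because only one bit differs between the two cases, the conversion can be done with a single bitwise operation. A minimal sketch follows; the names `CASE_BIT`, `to_upper` and `to_lower` are illustrative, and the functions only handle the unaccented letters a to z and A to Z.

```python
CASE_BIT = 0b0100000  # weighting 32: the sixth bit from the right


def to_upper(ch: str) -> str:
    """Clear the case bit of a lowercase letter; leave other characters alone."""
    return chr(ord(ch) & ~CASE_BIT) if "a" <= ch <= "z" else ch


def to_lower(ch: str) -> str:
    """Set the case bit of an uppercase letter; leave other characters alone."""
    return chr(ord(ch) | CASE_BIT) if "A" <= ch <= "Z" else ch


print(to_upper("a"), format(ord("a"), "07b"), format(ord("A"), "07b"))
# A 1100001 1000001
```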
Each row lists 16 consecutive codes; the column heading gives the second hexadecimal digit.
Dec        Hex      0 1 2 3 4 5 6 7 8 9 A B C D E F
128–143    80–8F    Ç ü é â ä à å ç ê ë è ï î ì Ä Å
144–159    90–9F    É æ Æ ô ö ò û ù ÿ Ö Ü ¢ £ ¥ ₧ ƒ
160–175    A0–AF    á í ó ú ñ Ñ ª º ¿ ⌐ ¬ ½ ¼ ¡ « »
176–191    B0–BF    ░ ▒ ▓ │ ┤ ╡ ╢ ╖ ╕ ╣ ║ ╗ ╝ ╜ ╛ ┐
192–207    C0–CF    └ ┴ ┬ ├ ─ ┼ ╞ ╟ ╚ ╔ ╩ ╦ ╠ ═ ╬ ╧
208–223    D0–DF    ╨ ╤ ╥ ╙ ╘ ╒ ╓ ╫ ╪ ┘ ┌ █ ▄ ▌ ▐ ▀
224–239    E0–EF    α ß Γ π Σ σ μ τ Φ Θ Ω δ ∞ ø ε ∩
240–255    F0–FF    ≡ ± ≥ ≤ ⌠ ⌡ ÷ ≈ ° ▪ ▪ √ ³ ² ■ □
▲ Table 1.6 Extended ASCII code table
Since ASCII code has a number of disadvantages and is unsuitable for some purposes, different methods of coding have been developed over the years. One coding system is called Unicode. Unicode allows characters in a code form to represent all languages of the world, thus supporting many operating systems, search engines and internet browsers used globally. There is overlap with standard ASCII code, since the first 128 (English) characters are the same, but Unicode can support a far larger number of different characters in total (over a million code values are available). As can be seen in Tables 1.5 and 1.6, ASCII uses one byte to represent a character, whereas Unicode will support up to four bytes per character.
The Unicode Consortium was set up in 1991. Version 1.0 was published with five goals; these were to
» create a universal standard that covered all languages and all writing systems
» produce a more efficient coding system than ASCII
» adopt uniform encoding where each character is encoded as 16-bit or 32-bit code
» create unambiguous encoding where each 16-bit or 32-bit value always represents the same character (it is worth pointing out here that the ASCII code tables are not standardised and versions other than the ones shown in Tables 1.5 and 1.6 exist)
» reserve part of the code for private use to enable a user to assign codes for their own characters and symbols (useful for Chinese and Japanese character sets).
A sample of Unicode characters is shown in Table 1.7. As can be seen from the table, characters used in languages such as Russian, Greek, Romanian and Croatian can now be represented in a computer.
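The range of one to four bytes per character can be seen with UTF-8, one widely used way of storing Unicode characters. UTF-8 is not named in the text above, so treat this as an illustration under that assumption; `utf8_length` is a made-up helper name.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes needed to store one character in UTF-8."""
    return len(ch.encode("utf-8"))


for ch in ["A", "Ơ", "€", "😀"]:
    print(f"U+{ord(ch):04X} needs {utf8_length(ch)} byte(s)")

# The first 128 characters are the same as standard ASCII.
print("A".encode("utf-8") == "A".encode("ascii"))  # True
```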
        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
01A0    Ơ  ơ  Ƣ  ƣ  Ƥ  ƥ  Ʀ  Ƨ  ƨ  Ʃ  ƪ  ƫ  Ƭ  ƭ  Ʈ  Ư
01B0    ư  Ʊ  Ʋ  Ƴ  ƴ  Ƶ  ƶ  Ʒ  Ƹ  ƹ  ƺ  ƻ  Ƽ  ƽ  ƾ  ƿ
01C0    ǀ  ǁ  ǂ  ǃ  Ǆ  ǅ  ǆ  Ǉ  ǈ  ǉ  Ǌ  ǋ  ǌ  Ǎ  ǎ  Ǐ
01D0    ǐ  Ǒ  ǒ  Ǔ  ǔ  Ǖ  ǖ  Ǘ  ǘ  Ǚ  ǚ  Ǜ  ǜ  ǝ  Ǟ  ǟ
01E0    Ǡ  ǡ  Ǣ  ǣ  Ǥ  ǥ  Ǧ  ǧ  Ǩ  ǩ  Ǫ  ǫ  Ǭ  ǭ  Ǯ  ǯ
01F0    ǰ  Ǳ  ǲ  ǳ  Ǵ  ǵ  Ƕ  Ƿ  Ǹ  ǹ  Ǻ  ǻ  Ǽ  ǽ  Ǿ  ǿ
0200    Ȁ  ȁ  Ȃ  ȃ  Ȅ  ȅ  Ȇ  ȇ  Ȉ  ȉ  Ȋ  ȋ  Ȍ  ȍ  Ȏ  ȏ
0210    Ȑ  ȑ  Ȓ  ȓ  Ȕ  ȕ  Ȗ  ȗ  Ș  ș  Ț  ț  Ȝ  ȝ  Ȟ  ȟ
0220    Ƞ  ȡ  Ȣ  ȣ  Ȥ  ȥ  Ȧ  ȧ  Ȩ  ȩ  Ȫ  ȫ  Ȭ  ȭ  Ȯ  ȯ
0230    Ȱ  ȱ  Ȳ  ȳ  ȴ  ȵ  ȶ  ȷ  ȸ  ȹ  Ⱥ  Ȼ  ȼ  Ƚ  Ⱦ  ȿ
0240    ɀ  Ɂ  ɂ  Ƀ  Ʉ  Ʌ  Ɇ  ɇ  Ɉ  ɉ  Ɋ  ɋ  Ɍ  ɍ  Ɏ  ɏ
0250    ɐ  ɑ  ɒ  ɓ  ɔ  ɕ  ɖ  ɗ  ɘ  ə  ɚ  ɛ  ɜ  ɝ  ɞ  ɟ
0260    ɠ  ɡ  ɢ  ɣ  ɤ  ɥ  ɦ  ɧ  ɨ  ɩ  ɪ  ɫ  ɬ  ɭ  ɮ  ɯ
0270    ɰ  ɱ  ɲ  ɳ  ɴ  ɵ  ɶ  ɷ  ɸ  ɹ  ɺ  ɻ  ɼ  ɽ  ɾ  ɿ
0280    ʀ  ʁ  ʂ  ʃ  ʄ  ʅ  ʆ  ʇ  ʈ  ʉ  ʊ  ʋ  ʌ  ʍ  ʎ  ʏ
0290    ʐ  ʑ  ʒ  ʓ  ʔ  ʕ  ʖ  ʗ  ʘ  ʙ  ʚ  ʛ  ʜ  ʝ  ʞ  ʟ
02A0    ʠ  ʡ  ʢ  ʣ  ʤ  ʥ  ʦ  ʧ  ʨ  ʩ  ʪ  ʫ  ʬ  ʭ  ʮ  ʯ
02B0    ʰ  ʱ  ʲ  ʳ  ʴ  ʵ  ʶ  ʷ  ʸ  ʹ  ʺ  ʻ  ʼ  ʽ  ʾ  ʿ
▲ Table 1.7 Sample of Unicode characters
1.2 Multimedia
Key terms
Bit-map image – system that uses pixels to make up an image.
Pixel – smallest picture element that makes up an image.
Colour depth – number of bits used to represent the colours in a pixel, e.g. 8 bit colour depth can represent 2⁸ = 256 colours.
Bit depth – number of bits used to represent the smallest unit in, for example, a sound or image file – the larger the bit depth, the better the quality of the sound or colour image.
Image resolution – number of pixels that make up an image, for example, an image could contain 4096 × 3192 pixels (13 074 432 pixels in total).
Screen resolution – number of horizontal and vertical pixels that make up a screen display. If the screen resolution is smaller than the image resolution, the whole image cannot be shown on the screen, or the original image will become lower quality.
Resolution – number of pixels per column and per row on a monitor or television screen.
Pixel density – number of pixels per square centimetre.
Vector graphics – images that use 2D points to describe lines and curves and their properties that are grouped to form geometric shapes.
Sampling resolution – number of bits used to represent sound amplitude (also known as bit depth).
Sampling rate – number of sound samples taken per second.
Frame rate – number of video frames that make up a video per second.
Images can be stored in a computer in two common formats: bit-map image and vector graphic.
1.2.1 Bit-map images
Bit-map images are made up of pixels (picture elements); the image is stored in a two-dimensional matrix of pixels. Pixels can take different shapes.
[Illustration: examples of pixel shapes]
EXTENSION ACTIVITY 1B
Find out how HTML is used to control the colour of each pixel on a screen. How is HTML used in the design stage of a web page screen layout?
When storing images as pixels, we have to consider
» at least 8 bits (1 byte) per pixel are needed to code a coloured image (this gives 256 possible colours by varying the intensity of the blue, green and red elements)
» true colour requires 3 bytes per pixel (24 bits), which gives more than 16 million colours (2²⁴ = 16 777 216)
» the number of bits used to represent a pixel is called the colour depth.
In terms of images, we need to distinguish between bit depth and colour depth; for example, the number of bits that are used to represent a single pixel (bit depth) will determine the colour depth of that pixel. As the bit depth increases, the number of possible colours which can be represented also increases. For example, a bit depth of 8 bits per pixel allows 256 (2⁸) different colours (the colour depth) to be represented, whereas using a bit depth of 32 bits per pixel results in 4 294 967 296 (2³²) different colours. The impact of bit depth and colour depth is considered later.
We will now consider the actual image itself and how it can be displayed on a screen. There are two important definitions here:
» Image resolution refers to the number of pixels that make up an image; for example, an image could contain 4096 × 3192 pixels (13 074 432 pixels in total).
» Screen resolution refers to the number of horizontal pixels and the number of vertical pixels that make up a screen display (for example, if the screen resolution is smaller than the image resolution then the whole image cannot be shown on the screen or the original image will now be a lower quality).
We will try to clarify the difference by using an example. Figure 1.1 has been taken by a digital camera using an image resolution of 4096 × 3192 pixels:
▲ Figure 1.1 Image taken by a digital camera
▲ Figure 1.2 Image cropped and rotated through 90°
Suppose we wish to display Figure 1.1 on a screen with screen resolution of 1920 × 1080. To display this image the web browser (or other software) would need to re-size Figure 1.1 so that it now fits the screen. This could be done by removing pixels so that it could now be displayed, or part of the image could be cropped (and, in this case, rotated through 90°) as shown in Figure 1.2.
However, a lower resolution copy of Figure 1.1 (for example, 1024 × 798) would now fit on the screen without any modification to the image. We could simply zoom in to enlarge it to full screen size; however, the image could now become pixelated (in other words, the number of pixels per inch (known as the pixel density) is smaller, causing deterioration in the image quality).
We will now consider a calculation which shows how pixel density can be calculated for a given screen. Imagine we are using an Apple iPhone 8 Plus which has a 5.5-inch screen size (measured diagonally) and a screen resolution of 1920 pixels × 1080 pixels:
1 add together the squares of the resolution size (1920² + 1080² = 3 686 400 + 1 166 400 = 4 852 800)
2 find the square root (√4 852 800 = 2202.907)
3 divide by screen size (2202.907 ÷ 5.5 = 401)
This gives us the pixel density of 401 pixels per inch (ppi) (which is the same as the published figure from the manufacturer).
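The three steps of the pixel density calculation translate directly into code. A minimal sketch, assuming the screen size is the diagonal measurement in inches; `pixel_density` is an illustrative name.

```python
import math


def pixel_density(width_px: int, height_px: int, screen_size_inches: float) -> float:
    """Pixels along the screen diagonal divided by the diagonal screen size."""
    sum_of_squares = width_px ** 2 + height_px ** 2  # step 1
    diagonal_px = math.sqrt(sum_of_squares)          # step 2
    return diagonal_px / screen_size_inches          # step 3


print(round(pixel_density(1920, 1080, 5.5)))  # 401
```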
A pixel-generated image can be scaled up or scaled down; it is important to understand that this can be done when deciding on the resolution. The resolution can be varied on many cameras before taking, for example, a digital photograph. When magnifying an image, the number of pixels that makes up the image remains the same but the area they cover is now increased. This means the pixel density is reduced and some of the sharpness could be lost. Pixel density is therefore key when scaling up photographs. For example, look at Figure 1.3.
▲ Figure 1.3 Five images of the same car wheel (labelled A to E)
Image A is the original. By the time it has been scaled up to make image E it has become pixelated (‘fuzzy’). This is because images A and E have different pixel densities. The main drawback of using high resolution images is the increase in file size. As the number of pixels used to represent the image is increased, the size of the file will also increase. This impacts on how many images can be stored on, for example, a hard drive. It also impacts on the time to download an image from the internet or the time to transfer images from device to device. Bit-map images rely on certain properties of the human eye and, up to a point, the amount of file compression used (see Section 1.3 File compression). The eye can tolerate a certain amount of resolution reduction before the loss of quality becomes significant.
EXTENSION ACTIVITY 1C Calculate the file size needed to store the screen image on a UHD television.
Calculating bit-map image file sizes
It is possible to estimate the file size needed to store a bit-map image. The file size will need to take into account the image resolution and bit depth. For example, a full screen with a resolution of 1920 × 1080 pixels and a bit depth of 24 requires 1920 × 1080 × 24 bits = 49 766 400 bits for the full screen image. Dividing by 8 gives us 6 220 800 bytes (equivalent to 6.221 MB using the SI units or 5.933 MiB using IEC units). An image which does not occupy the full screen will obviously result in a smaller file size.
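The estimate above is simply width × height × bit depth, converted from bits to bytes. A short sketch (the function name is illustrative, and the file header is ignored):

```python
def bitmap_size_bytes(width_px: int, height_px: int, bit_depth: int) -> int:
    """Estimate the size of an uncompressed bit-map image in bytes."""
    total_bits = width_px * height_px * bit_depth
    return total_bits // 8


size = bitmap_size_bytes(1920, 1080, 24)
print(size)                         # 6220800
print(round(size / 1000 ** 2, 3))   # 6.221 (MB, SI units)
print(round(size / 1024 ** 2, 3))   # 5.933 (MiB, IEC units)
```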
Note: when saving a bit-map image, it is important to include a file header; this will contain items such as file type (.bmp or .jpeg), file size, image resolution, bit depth (usually 1, 8, 16, 24 or 32), any type of data compression employed and so on.
1.2.2 Vector graphics
Vector graphics are images that use 2D points to describe lines and curves and their properties that are grouped to form geometric shapes. Vector graphics can be designed using computer aided design (CAD) software or using an application which uses a drawing canvas on the screen. See Figure 1.4.
▲ Figure 1.4 Drawing of a robot made up of a number of geometric shapes
A vector graphic will contain a drawing list (included in a file header) that is made up of
» the command used for each object that makes up the graphic image
» the attributes that define the properties that make up each object (for example consider the ellipse of the robot’s mouth – this will need the position of the two centres, the radius from centres, the thickness and style of each line, the line colour and any fill colour used)
» the relative position of each object will also need to be included
» the dimensions of each object are not defined, but the relative positions of objects to each other in the final graphic need to be defined; this means that scaling up the vector graphic image will result in no loss of quality.
When printing out vector graphics it is usually necessary to first convert it into a bit-map image to match the format of most printers.
Comparison between vector graphics and bit-map images
Vector graphic images | Bit-map images
made up of geometric shapes which require definition/attributes | made up of tiny pixels of different colours
to alter/edit the design, it is necessary to change each of the geometric shapes | possible to alter/edit each of the pixels to change the design of the image
they do not require large file size since it is made up of simple geometric shapes | because of the use of pixels (which give very accurate designs), the file size is very large
because the number of geometric shapes is limited, vector graphics are not usually very realistic | since images are built up pixel by pixel, the final image is usually very realistic
file formats are usually .svg, .cgm, .odg | file formats are usually .jpeg, .bmp, .png
▲ Table 1.8 Comparison between vector graphics and bit-map images
It is now worth considering whether a vector graphic or a bit-map image would be the best choice for a given application. When deciding which is the better method, we should consider the following:
» Does the image need to be resized? If so, a vector graphic could be the best option.
» Does the image need to be drawn to scale? Again, a vector graphic is probably the best option.
» Does the image need to look real? Usually bit-map images look more realistic than vector graphics.
» Are there file restrictions? If so, it is important to consider whether vector graphic images can be used; if not, it would be necessary to consider the image resolution of a bit-map image to ensure the file size is not too large.
For example, when designing a logo for a company or composing an ‘exploded diagram’ of a car engine, vector graphics are the best choice.
However, when modifying photographs using photo software, the best method is to use bit-map images.
1.2.3 Sound files
Sound requires a medium through which to travel (it cannot travel in a vacuum). This is because it is transmitted by causing oscillations of particles within the medium. The human ear picks up these oscillations (changes in air pressure) and interprets them as sound. Each sound wave has a frequency and wavelength; the amplitude specifies the loudness of the sound.
▲ Figure 1.5 High and low frequency wave signals (pressure plotted against time, with the period of each wave marked)
Sound is an analogue value; this needs to be digitised in order to store sound in a computer. This is done using an analogue to digital converter (ADC). If the sound is to be used as a music file, it is often filtered first to remove higher frequencies and lower frequencies which are outside the range of human hearing. To convert the analogue data to digital, the sound waves are sampled at a given time rate. The amplitude of the sound cannot be measured precisely, so approximate values are stored.
▲ Figure 1.6 A sound wave (sound amplitude, 0 to 10, plotted against time intervals, 0 to 20)
Figure 1.6 shows a sound wave. The x-axis shows the time intervals when the sound was sampled (0 to 20), and the y-axis shows the amplitude of the sampled sound (the amplitudes above 10 and below 0 are filtered out in this example). At time interval 1, the approximate amplitude is 9; at time interval 2, the approximate amplitude is 4, and so on for all 20 time intervals. Because the amplitude range in Figure 1.6 is 0 to 10, 4 binary bits can be used to represent each amplitude value (for example, 9 would be represented by the binary value 1001). Increasing the number of possible values used to represent sound amplitude also increases the accuracy of the sampled sound (for example, using a range of 0 to 127 gives a much more accurate representation of the sound sample than using a range of, for example, 0 to 10). The number of bits used to represent each amplitude value is known as the sampling resolution (also known as the bit depth).
Sampling rate is the number of sound samples taken per second. The higher the sampling rate and/or sampling resolution, the greater the file size. For example, a 16-bit sampling resolution is used when recording CDs to give better sound quality.
So, how is sampling used to record a sound clip?
» The amplitude of the sound wave is first determined at set time intervals (the sampling rate).
» This gives an approximate representation of the sound wave.
» The sound wave is then encoded as a series of binary digits.
Using a higher sampling rate or larger resolution will result in a more faithful representation of the original sound source.
Pros | Cons
larger dynamic range | produces larger file size
better sound quality | takes longer to transmit/download sound files
less sound distortion | requires greater processing power
▲ Table 1.9 The pros and cons of using a larger sampling resolution when recording sound
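The sampling process can be imitated in a few lines of Python. The sketch below samples a pure sine wave (an assumption made for simplicity; real sound is far more complex) and rounds each sample down to one of the levels allowed by the sampling resolution. The function names and the uncompressed size estimate (samples × bits per sample) are illustrative and not taken from the text.

```python
import math


def sample_wave(frequency_hz: float, sampling_rate: int,
                duration_s: float, bits: int) -> list[int]:
    """Sample a sine wave and store each amplitude as a whole-number level."""
    levels = 2 ** bits                      # e.g. 4 bits give 16 levels
    samples = []
    for n in range(int(sampling_rate * duration_s)):
        t = n / sampling_rate
        amplitude = (math.sin(2 * math.pi * frequency_hz * t) + 1) / 2  # 0.0 to 1.0
        samples.append(min(levels - 1, int(amplitude * levels)))
    return samples


def raw_size_bits(sampling_rate: int, duration_s: float, bits: int) -> int:
    """Uncompressed size: number of samples multiplied by bits per sample."""
    return int(sampling_rate * duration_s) * bits


print(sample_wave(1, 20, 1, 4))   # 20 samples, each between 0 and 15
print(raw_size_bits(20, 1, 4))    # 80
```

Doubling either the sampling rate or the sampling resolution doubles the value returned by `raw_size_bits`, which matches the statement that both increase the file size.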
Recorded sound is often edited using software. Common features of such software include the ability to
» edit the start/stop times and duration of a sample
» extract and save (or delete) part of a sample
» alter the frequency and amplitude of a sample
» fade in and fade out
» mix and/or merge multiple sound tracks or sources
» combine various sound sources together and alter their properties
» remove ‘noise’ to enhance one sound wave in a multiple of waves (for example, to identify and extract one person’s voice out of a group of people)
» convert between different audio formats.
1.2.4 Video
This section considers the use of video. Although video is not specifically mentioned in the syllabus, it has been included here for completeness.
Many specialist video cameras exist. However, most digital cameras, smart phones and tablets are also capable of taking moving images by ‘stitching’ a number of still photos (frames) together. They are often referred to as DV (digital video) cameras; they store compressed photo frames at a speed of 25 MB per second – this is known as motion JPEG.
In both single frame and video versions, the camera picks up the light from the image and turns it into an electronic signal using light-sensitive sensors. In the case of the DV cameras, these signals are automatically converted into a compressed digital file format.
When recording video, the frame rate refers to the number of frames recorded per second.
Key terms
Lossless file compression – file compression method where the original file can be restored following decompression.
Lossy file compression – file compression method where parts of the original file cannot be recovered during decompression, so some of the original detail is lost.
JPEG – Joint Photographic Expert Group – a form of lossy file compression based on the inability of the eye to spot certain colour changes and hues.
MP3/MP4 files – file compression method used for music and multimedia files.
Audio compression – method used to reduce the size of a sound file using perceptual music shaping.
Perceptual music shaping – method where sounds outside the normal range of hearing of humans, for example, are eliminated from the music file during compression.
Bit rate – number of bits per second that can be transmitted over a network. It is a measure of the data transfer rate over a digital telecoms network.
Run length encoding (RLE) – a lossless file compression technique used to reduce text and photo files in particular.
1.3 File compression
It is often necessary to reduce the size of a file, either to save storage space or to reduce the time taken to stream or transmit data from one device to another (see Chapter 2). The two most common forms of file compression are lossless file compression and lossy file compression.
Lossless file compression
With this technique, all the data from the original file can be reconstructed when the file is uncompressed again. This is particularly important for files where loss of any data would be disastrous (such as a spreadsheet file of important results).
Lossy file compression
With this technique, the file compression algorithm eliminates unnecessary data (as with MP3 and JPEG formats, for example).
Lossless file compression is designed to lose none of the original detail from the file (such as Run-Length Encoding (RLE) which is covered later in this chapter). Lossy file compression usually results in some loss of detail when compared to the original; it is usually impossible to reconstruct the original file. The algorithms used in the lossy technique have to decide which parts of the file are important (and need to be kept) and which parts can be discarded.
We will now consider file compression techniques applied to multimedia files.
1.3.1 File compression applications
MPEG-3 (MP3) and MPEG-4 (MP4)
MPEG-3 (MP3) uses technology known as audio compression to convert music and other sounds into an MP3 file format. Essentially, this compression technology will reduce the size of a normal music file by about 90%. For example, an 80 MB music file on a CD can be reduced to 8 MB using MP3 technology.
MP3 files are used in MP3 players, computers or mobile phones. Music files can be downloaded or streamed from the internet in a compressed format, or CD files can be converted to MP3 format. While streamed or MP3 music quality can never match the ‘full’ version found on a CD, the quality is satisfactory for most purposes.
But how can the original music file be reduced by 90% while still retaining most of the music quality? This is done using file compression algorithms that use perceptual music shaping. Perceptual music shaping removes certain sounds. For example
» frequencies that are outside the human hearing range
» if two sounds are played at the same time, only the louder one can be heard by the ear, so the softer sound is eliminated.
This means that certain parts of the music can be removed without affecting the quality too much. MP3 files use what is known as a lossy format, since part of the original file is lost following the compression algorithm. This means that the original file cannot be put back together again. However, even the quality of MP3 files can be different, since it depends on the bit rate – this refers to the number of bits per second used when creating the file. Bit rates are between 80 and 320 kilobits per second; usually 200 kilobits or higher gives a sound quality close to a normal CD.
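The link between bit rate and file size can be sketched with a short calculation. This is a rough illustration only: it assumes a constant bit rate, ignores any file header, takes 1 MB as 1 000 000 bytes, and uses a hypothetical four-minute (240 second) track. The function name is made up for this example.

```python
def audio_file_size_mb(bit_rate_kbps, duration_seconds):
    """Approximate file size in MB for audio encoded at a constant bit rate."""
    bits = bit_rate_kbps * 1000 * duration_seconds  # kilobits per second -> bits
    return bits / 8 / 1_000_000                     # bits -> bytes -> megabytes


# Hypothetical 240-second track at the lowest, a typical and the highest bit rate.
for rate in (80, 200, 320):
    print(rate, "kbps:", round(audio_file_size_mb(rate, 240), 1), "MB")
# 80 kbps: 2.4 MB
# 200 kbps: 6.0 MB
# 320 kbps: 9.6 MB
```

A higher bit rate keeps more of the original detail, so the file grows in direct proportion to the bit rate chosen.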
EXTENSION ACTIVITY 1D Find out how file compression can be applied to a photograph without noticeably reducing its quality. Compare this to run-length encoding (RLE), described below.
MPEG-4 (MP4) files are slightly different to MP3 files. This format allows the storage of multimedia files rather than just sound. Music, videos, photos and animation can all be stored in the MP4 format. Videos, for example, could be streamed over the internet using the MP4 format without losing any real discernible quality (see Chapter 2 for notes on video streaming).
Photographic (bit-map) images
When a photographic file is compressed, both the file size and quality of image are reduced. A common file format for images is JPEG, which uses lossy file compression. Once the image is subjected to the JPEG compression algorithm, a new file is formed and the original file can no longer be reconstructed. A JPEG will reduce the raw bit-map image by a factor of between 5 and 15, depending on the quality of the original.
Vector graphics can also undergo some form of file compression. Scalable vector graphics (.svg) are defined in XML text files, which allows them to be compressed.
Run-length encoding (RLE)
Run-length encoding (RLE) can be used to compress a number of different file formats. It is a form of lossless/reversible file compression that reduces the size of a string of adjacent, identical data (such as repeated colours in an image). A repeating string is encoded into two values. The first value represents the number of identical data items (such as characters) in the run. The second value represents the code of the data item (such as ASCII code if it is a keyboard character). RLE is only effective where there is a long run of repeated units/bits.
Using RLE on text data
Consider the text string ‘aaaaabbbbccddddd’. Assuming each character requires 1 byte, then this string needs 16 bytes. If we assume ASCII code is being used, then the string can be coded as follows:
String   aaaaa   bbbb    cc      ddddd
Code     05 97   04 98   02 99   05 100
This means we have five characters with ASCII code 97, four characters with ASCII code 98, two characters with ASCII code 99, and five characters with ASCII code 100. Assuming each number in the second row requires 1 byte of memory, the RLE code will need 8 bytes. This is half the original file size.
One issue occurs with a string such as ‘cdcdcdcdcd’, where compression is not very effective. To cope with this we use a flag. A flag preceding data indicates that what follows is the number of repeating units and the code of the unit (for example, 255 05 97 where 255 is the flag and the other two numbers indicate that there are five items with ASCII code 97). When a flag is not used, the next byte(s) are taken with their face value and a run of 1 (for example, 01 99 means one character with ASCII code 99 follows).
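The two text schemes described above (plain count/code pairs, and the flagged variant) can be sketched as follows. This is a minimal illustration, not a standard implementation: the function names are hypothetical, runs are assumed to be shorter than 256 so each count fits in 1 byte, the data is assumed never to contain the flag value 255, and the choice to flag only runs of four or more is an assumption (a flagged run costs 3 bytes, so shorter runs gain nothing).

```python
FLAG = 255  # flag value used in the text; assumed never to occur as ordinary data


def runs(data):
    """Yield (count, value) for each run of identical adjacent values."""
    count = 0
    previous = None
    for value in data:
        if count and value == previous:
            count += 1
        else:
            if count:
                yield count, previous
            previous, count = value, 1
    if count:
        yield count, previous


def rle_encode(text):
    """Plain RLE: every run becomes two values, the count then the ASCII code."""
    encoded = []
    for count, code in runs(text.encode("ascii")):
        encoded += [count, code]
    return encoded


def rle_encode_flagged(text, min_run=4):
    """Flagged RLE: long runs become FLAG, count, code; other bytes are stored as-is."""
    encoded = []
    for count, code in runs(text.encode("ascii")):
        if count >= min_run:
            encoded += [FLAG, count, code]
        else:
            encoded += [code] * count
    return encoded


print(rle_encode("aaaaabbbbccddddd"))
# [5, 97, 4, 98, 2, 99, 5, 100]
print(rle_encode_flagged("aaaaaaaabbbbbbbbbbcdcdcdeeeeeeee"))
# [255, 8, 97, 255, 10, 98, 99, 100, 99, 100, 99, 100, 255, 8, 101]
```

The second call uses a longer string that mixes long runs with alternating characters; the alternating part is stored at face value, which is where the flag saves space compared with writing a count of 1 before every character.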
Consider this example:
String   aaaaaaaa   bbbbbbbbbb   c       d        c       d        c       d        eeeeeeee
Code     08 97      10 98        01 99   01 100   01 99   01 100   01 99   01 100   08 101
The original string contains 32 characters and would occupy 32 bytes of storage. The coded version contains 18 values and would require 18 bytes of storage. Introducing a flag (255 in this case) produces:
255 08 97 255 10 98 99 100 99 100 99 100 255 08 101
This has 15 values and would, therefore, require 15 bytes of storage. This is a reduction in file size of about 53%.
Using RLE with images
Black and white images
Figure 1.7 shows the letter F in a grid where each square requires 1 byte of storage. A white square has a value 1 and a black square a value of 0.
1 1 1 1 1 1 1 1
1 0 0 0 0 0 0 1
1 0 1 1 1 1 1 1
1 0 1 1 1 1 1 1
1 0 0 0 0 0 1 1
1 0 1 1 1 1 1 1
1 0 1 1 1 1 1 1
1 0 1 1 1 1 1 1
▲ Figure 1.7 Using RLE with a black and white image
In compressed RLE format this becomes:
9W 6B 2W 1B 7W 1B 7W 5B 3W 1B 7W 1B 7W 1B 6W
Using W = 1 and B = 0 we get:
91 60 21 10 71 10 71 50 31 10 71 10 71 10 61
The 8 × 8 grid would need 64 bytes; the compressed RLE format has 30 values, and therefore needs only 30 bytes to store the image.
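The run lengths for the grid in Figure 1.7 can be checked with a short sketch. It assumes the grid is read row by row from the top left, with runs allowed to continue from the end of one row into the start of the next; the names `GRID` and `rle_image` are made up for this example.

```python
from itertools import groupby

# The 8 x 8 grid from Figure 1.7 (1 = white square, 0 = black square).
GRID = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
]


def rle_image(grid):
    """Return (count, value) runs, reading the grid row by row."""
    pixels = [pixel for row in grid for pixel in row]
    return [(len(list(group)), value) for value, group in groupby(pixels)]


image_runs = rle_image(GRID)
labelled = " ".join(f"{count}{'W' if value else 'B'}" for count, value in image_runs)
print(labelled)
# 9W 6B 2W 1B 7W 1B 7W 5B 3W 1B 7W 1B 7W 1B 6W
print(len(image_runs) * 2, "values instead of", len(GRID) * len(GRID[0]))
# 30 values instead of 64
```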
Coloured images
Figure 1.8 shows an object in four colours. Each colour is made up of red, green and blue (RGB) according to the code on the right.
Square colour   Red   Green   Blue   (components)
black           0     0       0
white           255   255     255
green           0     255     0
red             255   0       0
▲ Figure 1.8 Using RLE with a coloured image
This produces the following data:
2 0 0 0 4 0 255 0 3 0 0 0 6 255 255 255 1 0 0 0 2 0 255 0 4 255 0 0 4 0 255 0 1 255 255 255 2 255 0 0 1 255 255 255 4 0 255 0 4 255 0 0 4 0 255 0 4 255 255 255 2 0 255 0 1 0 0 0 2 255 255 255 2 255 0 0 2 255 255 255 3 0 0 0 4 0 255 0 2 0 0 0
The original image (8 × 8 square) would need 3 bytes per square (to include all three RGB values). Therefore, the uncompressed file for this image is 8 × 8 × 3 = 192 bytes. The RLE code has 92 values, which means the compressed file will be 92 bytes in size. This gives a file reduction of about 52%.
It should be noted that the file reductions in reality will not be as large as this due to other data which needs to be stored with the compressed file (such as a file header).
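The figures quoted for the coloured image can be reproduced by decoding the data. In this sketch each run is taken to be four values (count, red, green, blue), as in the data above; the names `RUNS` and `decode` are hypothetical.

```python
# Runs copied from the data in the text: (count, red, green, blue).
RUNS = [
    (2, 0, 0, 0), (4, 0, 255, 0), (3, 0, 0, 0), (6, 255, 255, 255),
    (1, 0, 0, 0), (2, 0, 255, 0), (4, 255, 0, 0), (4, 0, 255, 0),
    (1, 255, 255, 255), (2, 255, 0, 0), (1, 255, 255, 255), (4, 0, 255, 0),
    (4, 255, 0, 0), (4, 0, 255, 0), (4, 255, 255, 255), (2, 0, 255, 0),
    (1, 0, 0, 0), (2, 255, 255, 255), (2, 255, 0, 0), (2, 255, 255, 255),
    (3, 0, 0, 0), (4, 0, 255, 0), (2, 0, 0, 0),
]


def decode(runs):
    """Expand each run back into its individual (red, green, blue) pixels."""
    pixels = []
    for count, red, green, blue in runs:
        pixels += [(red, green, blue)] * count
    return pixels


pixels = decode(RUNS)
uncompressed_bytes = len(pixels) * 3  # 3 bytes (R, G, B) per square
compressed_bytes = len(RUNS) * 4      # 4 values per run, 1 byte each
reduction = 100 * (1 - compressed_bytes / uncompressed_bytes)
print(len(pixels), uncompressed_bytes, compressed_bytes, round(reduction))
# 64 192 92 52
```

Because decoding restores all 64 squares exactly, the sketch also shows why RLE counts as lossless.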
1.3.2 General methods of compressing files
All the above file compression techniques are excellent for very specific types of file. However, it is also worth considering some general methods to reduce the size of a file without the need to use lossy or lossless file compression:
movie files
» reduce the sampling rate used
» reduce the sampling resolution
» reduce the frame rate
image files
» crop the image
» decrease the colour/bit depth
» reduce the image resolution
▲ Figure 1.9 General methods of compressing files
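The effect of these general methods can be estimated from the uncompressed file size. The sketch below is illustrative only: it ignores file headers, the function names are made up for this example, and the image dimensions, sampling rates and durations are hypothetical.

```python
def bitmap_size_bytes(width, height, bit_depth):
    """Uncompressed bit-map size: one value of bit_depth bits per pixel."""
    return width * height * bit_depth // 8


def sound_size_bytes(sampling_rate_hz, resolution_bits, seconds):
    """Uncompressed single-channel sound size."""
    return sampling_rate_hz * resolution_bits * seconds // 8


def movie_frames_size_bytes(width, height, bit_depth, frame_rate, seconds):
    """Uncompressed size of the picture frames only (no sound track)."""
    return bitmap_size_bytes(width, height, bit_depth) * frame_rate * seconds


print(bitmap_size_bytes(1024, 768, 24))   # 2359296 (starting point)
print(bitmap_size_bytes(1024, 768, 8))    # 786432  (lower colour/bit depth)
print(bitmap_size_bytes(512, 384, 24))    # 589824  (lower image resolution)
print(sound_size_bytes(44100, 16, 10))    # 882000  (starting point)
print(sound_size_bytes(22050, 8, 10))     # 220500  (lower rate and resolution)
print(movie_frames_size_bytes(512, 384, 24, 25, 2))  # 29491200
print(movie_frames_size_bytes(512, 384, 24, 12, 2))  # 14155776 (lower frame rate)
```

Each method simply reduces one of the factors in the product, so the saving is proportional to the reduction made.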
ACTIVITY 1I
1 a) What is meant by lossless and lossy file compression?
b) Give an example of a lossless file format and an example of a lossy file format.
2 a) Describe how music picked up by a microphone is turned into a digitised music file in a computer.
b) Explain why it is often necessary to compress stored music files. Describe how the music quality is essentially retained.
3 a) What is meant by run length encoding?
b) Describe how RLE compresses a file. Give an example in your description.
4 a) Describe the differences between bit-map images and vector graphics.
b) A software designer needs to incorporate images into her software to add realism. Explain what she needs to consider when deciding between using bit-map images and vector graphics in her software.
End of chapter questions
1 a) The following bytes represent binary integers using the two’s complement form. State the equivalent denary values.
i) 0 1 0 0 1 1 1 1 [1]
ii) 1 0 0 1 1 0 1 0 [1]
iii) Write the integer −53 in two’s complement form. [1]
iv) Write the maximum possible range of numbers using the two’s complement form of an 8-bit binary number. Give your answers in denary. [2]
b) i) Write the denary integer 798 in binary-coded decimal (BCD) format. [1]
ii) Write the denary number that is represented by the following BCD number. [2]
1 1 1 1 1 1 1 1 1 1
c) Give one use of binary-coded decimal system. [1]
2 A software developer is using a microphone and a sound editing app to collect and edit sounds for his new game. When collecting sounds, the software developer can decide on the sampling resolution he wishes to use.
a) i) State what is meant by sampling resolution. [1]
ii) Describe how sampling resolution will affect how accurate the stored digitised sound will be. [2]
b) The software developer will include images in his new game.
i) Explain the term image resolution. [1]
ii) The software developer is using 16-colour bit-map images. State the number of bits required to encode data for one pixel of his image. [1]
iii) One of the images is 16 384 pixels wide and 512 pixels high. The developer decides to save it as a 256-colour bit-map image. Calculate the size of the image file in gibibytes. [3]
iv) The bit-map image will contain a header. State two items you would expect to see in the header. [2]
v) Give three features you would expect to see in the sound editing app. [3]
3 The editor of a movie is finalising the music score. They will send the final version of the score to the movie producer by email attachment.
a) Describe how sampling is used to record the music sound clips. [3]
b) The music sound clips need to undergo some form of data compression before the music editor can send them via email. Identify the type of compression, lossy or lossless, they should use. Give a justification for your answer. [3]
c) One method of data compression is known as run length encoding (RLE).
i) Explain what is meant by RLE. [3]
ii) Show how RLE would be used to produce a compressed file for the image below. Write down the data you would expect to see in the RLE compressed format (you may assume that the grey squares have a code value of 85 and the white squares have a code value of 255). [4]
4 a) Write the denary numbers 60, 27 and −27 in 8-bit binary two’s complement form. [3]
b) Show the result of the addition 60 + 27 using 8-bit binary two’s complement form. Show all of your working. [2]
c) Show the result of the subtraction 60 − 27 using 8-bit binary two’s complement form. [2]
d) Give the result of the following addition.
01011001 + 01100001
Explain why the expected result is not obtained. [2]
5 a) Carry out 0.52 + 0.83 using binary-coded decimal (BCD). Show all of your working. [4]
b) i) Define the term hexadecimal. [1]
ii) Give two uses of the hexadecimal system. [2]
iii) Convert the following binary number into hexadecimal. [2]
0111111011110010
6 a) Convert the denary number 95 into binary coded decimal (BCD). [1]
b) Using two’s complement, carry out the binary subtraction:
00100011 – 01000100
and convert your answer into denary. [3]
c) Convert the denary number 506 into hexadecimal. [1]
2 Communication
In this chapter, you will learn about
★ the benefits of networking devices
★ the characteristics of a local area network (LAN) and a wide area network (WAN)
★ client-server and peer-to-peer models in networking
★ the differences between thin client and thick client
★ bus, star, mesh and hybrid networking topologies
★ public and private cloud computing
★ the differences between wired and wireless networks (including types of cable and wireless technologies)
★ the hardware required to support a LAN
★ the function of routers
★ Ethernet and how data collisions are detected and avoided
★ bit streaming (including differences between real-time and on-demand streaming of data)
★ the differences between the internet and the World Wide Web (WWW)
★ the hardware needed to support the internet
★ IP addresses (including IPv4, IPv6, public IP addresses and private IP addresses)
★ the use of the uniform resource locator (URL) to locate a resource on the world wide web
★ the role of the domain name service (DNS).
WHAT YOU SHOULD ALREADY KNOW
Try these three questions before you read this chapter.
1 a) Explain the following terms associated with devices connected to a network/internet.
i) MAC address
ii) IP address
b) Explain the main differences between a MAC address and an IP address and why it is necessary to have both associated with a device connected to the internet.
c) What is the purpose of an internet service provider (ISP)?
d) Explain the function of an internet browser. In what ways is this different to an ISP?
2 A college is about to form a network from 20 stand-alone computers. Describe the hardware and software that might be needed to produce this simple computer network.
3 a) Mobile phones and tablets can be configured to access the internet from any location. Describe the software required to allow this to happen.
b) Describe some of the benefits and drawbacks (when compared to a desktop PC) of accessing website pages from a mobile phone.
2.1 Networking
Key terms
ARPAnet – Advanced Research Projects Agency Network.
WAN – wide area network (network covering a very large geographical area).
LAN – local area network (network covering a small area such as a single building).
MAN – metropolitan area network (network which is larger than a LAN but smaller than a WAN, which can cover several buildings in a single city, such as a university campus).
File server – a server on a network where central files and other data are stored. They can be accessed by a user logged onto the network.
Hub – hardware used to connect together a number of devices to form a LAN that directs incoming data packets to all devices on the network (LAN).
Switch – hardware used to connect together a number of devices to form a LAN that directs incoming data packets to a specific destination address only.
Router – device which enables data packets to be routed between different networks (for example, can join LANs to form a WAN).
Modem – modulator demodulator. A device that converts digital data to analogue data (to be sent down a telephone wire); conversely it also converts analogue data to digital data (which a computer can process).
WLAN – wireless LAN.
(W)AP – (wireless) access point which allows a device to access a LAN without a wired connection.
PAN – network that is centred around a person or their workspace.
Client-server – network that uses separate dedicated servers and specific client workstations. All client computers are connected to the dedicated servers.
Spread spectrum technology – wideband radio frequency with a range of 30 to 50 metres.
Node – device connected to a network (it can be a computer, storage device or peripheral device).
Peer-to-peer – network in which each node can share its files with all the other nodes. Each node has its own data and there is no central server.
Thin client – device that needs access to the internet for it to work and depends on a more powerful computer for processing.
Thick client – device which can work both off line and on line and is able to do some processing even if not connected to a network/internet.
Bus network topology – network using single central cable in which all devices are connected to this cable so data can only travel in one direction and only one device is allowed to transmit at a time.
Packet – message/data sent over a network from node to node (packets include the address of the node sending the packet, the address of the packet recipient and the actual data – this is covered in greater depth in Chapter 14).
Star network topology – a network that uses a central hub/switch with all devices connected to this central hub/switch so all data packets are directed through this central hub/switch.
Mesh network topology – interlinked computers/devices, which use routing logic so data packets are sent from sending stations to receiving stations only by the shortest route.
Hybrid network – network made up of a combination of other network topologies.
Cloud storage – method of data storage where data is stored on off-site servers.
Data redundancy – situation in which the same data is stored on several servers in case of maintenance or repair.
Wi-Fi – wireless connectivity that uses radio waves, microwaves. Implements IEEE 802.11 protocols.
Bluetooth – wireless connectivity that uses radio waves in the 2.45 GHz frequency band.
Spread spectrum frequency hopping – a method of transmitting radio signals in which a device picks one of 79 channels at random. If the chosen channel is already in use, it randomly chooses another channel. It has a range up to 100 metres.
WPAN – wireless personal area network. A local wireless network which connects together devices in very close proximity (such as in a user’s house); typical devices would be a laptop, smartphone, tablet and printer.
Twisted pair cable – type of cable in which two wires of a single circuit are twisted together. Several twisted pairs make up a single cable.
Coaxial cable – cable made up of central copper core, insulation, copper mesh and outer insulation.
Fibre optic cable – cable made up of glass fibre wires which use pulses of light (rather than electricity) to transmit data.
Gateway – device that connects LANs which use different protocols.
Repeater – device used to boost a signal on both wired and wireless networks.
Repeating hubs – network devices which are a hybrid of hub and repeater unit.
Bridge – device that connects LANs which use the same protocols.
Softmodem – abbreviation for software modem; a software-based modem that uses minimal hardware.
NIC – network interface card. These cards allow devices to connect to a network/internet (usually associated with a MAC address set at the factory).
WNIC – wireless network interface cards/controllers.
Ethernet – protocol IEEE 802.3 used by many wired LANs.
Conflict – situation in which two devices have the same IP address.
Broadcast – communication where pieces of data are sent from sender to receiver.
Collision – situation in which two messages/data from different sources are trying to transmit along the same data channel.
CSMA/CD – carrier sense multiple access with collision detection – a method used to detect collisions and resolve the issue.
Bit streaming – contiguous sequence of digital bits sent over a network/internet.
Buffering – store which holds data temporarily.
Bit rate – number of bits per second that can be transmitted over a network. It is a measure of the data transfer rate over a digital telecoms network.
On demand (bit streaming) – system that allows users to stream video or music files from a central server as and when required without having to save the files on their own computer/tablet/phone.
Real-time (bit streaming) – system in which an event is captured by camera (and microphone) connected to a computer and sent to a server where the data is encoded. The user can access the data ‘as it happens’ live.
2.1.1 Networking devices
One of the earliest forms of networking, circa 1970 in the USA, was the Advanced Research Projects Agency Network (ARPAnet). This was an early form of packet switching wide area network (WAN) connecting a number of large computers in the Department of Defense. It later expanded to include university computers. It is generally agreed that ARPAnet developed the technical platform for what we now call the internet. Figure 2.1 shows the vast area this network covered.
▲ Figure 2.1 ARPAnet coverage, 1973
As personal computers developed through the 1980s, local networks began to appear. These became known as local area networks (LANs). LANs tended to be much smaller networks (usually inside one building) connecting a number of computers and shared devices, such as printers. WANs typically consist of a number of LANs connected via public communications networks (such as telephone lines or satellites). Because a WAN consists of LANs joined together, it may be a private network, and passwords and user IDs are required to access it.
This is in contrast to the internet which is a vast number of decentralised networks and computers which have a common point of access, so that anyone with access to the internet can connect to the computers on these networks. This makes it intrinsically different to a WAN.
In recent years, another type of network – a metropolitan area network (MAN) – has emerged. MANs are larger than LANs as they can connect together many small computer networks (e.g. LANs) housed in different buildings within a city (for example, a university campus). MANs are restricted in their size geographically to, for example, a single city.
In contrast, WANs can cover a much larger geographical area, such as a country or a continent. For example, a multi-national company may connect a number of smaller networks together (e.g. LANs or MANs) to form a world-wide WAN. This is covered in more detail later.
Here are some of the main benefits of networking computers and devices (rather than using a number of stand-alone computers):
» Devices, such as printers, can be shared (thus reducing costs).
» Licences to run software on networks are often far cheaper than buying licences for an equivalent number of stand-alone computers.
» Users can share files and data.
» Users have access to reliable data that comes from a central source, such as a file server.
» Data and files can be backed up centrally at the end of each day.
» Users can communicate using email and instant messaging.
» A network manager can oversee the network and, for example, apply access rights to certain files, or restrict access to external networks, such as the internet.
There are also a number of drawbacks:
» Cabling and servers can be an expensive initial outlay.
» Managing a large network can be a complex and difficult task.
» A breakdown of devices, such as the file servers, can affect the whole network.
» Malware and hacking can affect entire networks (particularly if a LAN is part of a much larger WAN), although firewalls do afford some protection in this respect.
Networked computers
Networked computers form an infrastructure which enables internal and external communications to take place. The infrastructure includes the following:
Hardware
» LAN cards
» routers
» switches
» wireless routers
» cabling
Software
» operation and management of the network
» operation of firewalls
» security applications/utilities
Services
» DSL
» satellite communication channels
» wireless protocols
» IP addressing.
Networks can be categorised as private or public. Private networks are owned by a single company or organisation (they are often LANs or intranets with restricted user access, for example, passwords and user IDs are required to join the network); the companies are responsible for the purchase of their own equipment and software, maintenance of the network and the hiring and training of staff.
Public networks are owned by a communications carrier company (such as a telecoms company); many organisations will use the network and there are usually no specific password requirements to enter the network – but subnetworks may be under security management.
WANs and LANs
Local area networks (LANs)
LANs are usually contained within one building, or within a small geographical area. A typical LAN consists of a number of computers and devices (such as printers) connected to hubs or switches. One of the hubs or switches is usually connected to a router and/or modem to allow the LAN to connect to the internet or become part of a wide area network (WAN).
Wireless LANs (WLANs)
Wireless LANs (WLANs) are similar to LANs but there are no wires or cables. In other words, they provide wireless network communications over fairly short distances (up to 100 metres) using radio or infrared signals instead of using cables.
Devices, known as wireless access points (WAPs), are connected into the wired network at fixed locations. Because of the limited range, most commercial LANs (such as those on a college campus or at an airport) need several WAPs to permit uninterrupted wireless communications. The WAPs use either spread spectrum technology (which is a wideband radio frequency with a range from a few metres to 100 metres) or infrared (which has a very short range of about 1 to 2 metres and is easily blocked, and therefore has limited use; see Section 2.1.5 Wired and wireless networking).
The WAP receives and transmits data between the WLAN and the wired network structure. End users access the WLAN through wireless LAN adapters which are either built into the devices or fitted as a plug-in module.
▲ Figure 2.2 Wireless local area networks (WLAN)
Wide area networks (WANs)
Wide area networks (WANs) are used when computers or networks are situated a long distance from each other (for example, they may be in different cities or on different continents). If a number of LANs are joined together using a router or modem, they can form a WAN. The network of automated teller machines (ATMs) used by banks is one of the most common examples of the use of a WAN.
Because of the long distances between devices, WANs usually make use of a public communications network (such as telephone lines or satellites), but they can use dedicated or leased communication lines which can be less expensive and more secure (less risk of hacking, for example).
A typical WAN will consist of end systems and intermediate systems, as shown in Figure 2.3. Systems 1, 3, 7 and 10 are known as end systems, and the remainder are known as intermediate systems. The distance between each system can be considerable, especially if the WAN is run by a multi-national company.
▲ Figure 2.3 A typical WAN
The following is used as a guide for deciding the ‘size’ of a network:
WAN: 100 km to over 1000 km
MAN: 1 km to 100 km
LAN: 10 m to 1000 m
PAN: 1 m to 10 m (this is not a commonly used term – it means personal area network; in other words, a home system)
2.1.2 Client-server and peer-to-peer networking models
We will consider two types of networking models, client-server and peer-to-peer.
Client-server model
The client sends a request to the server and the server finds the requested data and sends it back to the client.
A system administrator manages the whole network; clients are connected through a network; the model allows data access even over large distances.
▲ Figure 2.4 Client-server model
» The client-server model uses separate dedicated servers and specific client workstations; client computers will be connected to the server computer(s).
» Users are able to access most of the files, which are stored on dedicated servers.
» The server dictates which users are able to access which files. (Note: sharing of data is the most important part of the client-server model; with peer-to-peer, connectivity is the most important aspect.)
» The client-server model allows the installation of software onto a client’s computer.
» The model uses central security databases which control access to the shared resources. (Note: passwords and user IDs are required to log into the network.)
» Once a user is logged into the system, they will have access to only those resources (such as a printer) and files assigned to them by the network administrator, so the model offers greater security than peer-to-peer networks.
» Client-server networks can be as large as you want them to be and they are much easier to scale up than peer-to-peer networks.
» A central server looks after the storing, delivery and sending of emails.
» This model offers the most stable system; for example, if someone deletes a shared resource from the server, the nightly back-up would restore the deleted resource (this is different in peer-to-peer – see later).
» Client-server networks can become bottlenecked if there are several client requests at the same time.
» In the client-server model, a file server is used and is responsible for
– central storage and management of data files, thus enabling other network users to access files
– allowing users to share information without the need for offline devices (such as a memory stick)
– allowing any computer to be configured as the host machine and act as the file server (note that the server could be a storage device (such as SSD or HDD) that could also serve as a remote storage device for other computers, thus allowing them to access this device as if it were a local storage device attached to their computer).
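The request and access-control behaviour in the list above can be sketched as a small in-memory simulation. Everything here is hypothetical and for illustration only: the `FileServer` class, its method names, the user IDs and the file name are made up, no real networking takes place, and passwords are held as plain text purely to keep the sketch short.

```python
class FileServer:
    """Hypothetical in-memory stand-in for a dedicated file server."""

    def __init__(self):
        self._passwords = {}    # user ID -> password (the central security database)
        self._permissions = {}  # user ID -> names of the files that user may access
        self._files = {}        # file name -> contents, stored centrally

    def add_user(self, user_id, password, allowed_files=()):
        self._passwords[user_id] = password
        self._permissions[user_id] = set(allowed_files)

    def store(self, name, contents):
        self._files[name] = contents

    def handle_request(self, user_id, password, name):
        """Check the login, then the access rights, then send back the data."""
        if user_id not in self._passwords or self._passwords[user_id] != password:
            return "DENIED: login failed"
        if name not in self._permissions[user_id]:
            return "DENIED: no access rights"
        return self._files.get(name, "NOT FOUND")


server = FileServer()
server.store("results.csv", "alice,72")
server.add_user("alice", "s3cret", {"results.csv"})
server.add_user("bob", "hunter2")

print(server.handle_request("alice", "s3cret", "results.csv"))  # alice,72
print(server.handle_request("bob", "hunter2", "results.csv"))   # DENIED: no access rights
print(server.handle_request("alice", "wrong", "results.csv"))   # DENIED: login failed
```

Keeping the files, the logins and the access rights in one place is what lets the server dictate which users can reach which files; it is also why the server becomes a bottleneck when many clients send requests at once.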
Examples of use of client-server network model
A company/user would choose a client-server network model for the following reasons.
» The company/user has a large user-base (however, it should be pointed out that this type of network model may still be used by a small group of people who are doing independent projects but need to have sharing of data and access to data outside the group).
» Access to network resources needs to be properly controlled.
» There is a need for good network security.
» The company requires its data to be free from accidental loss (in other words, data needs to be backed up at a central location).
An example is the company Amazon; it uses the client-server network model. The user front-end is updated every time a user logs on to the Amazon website and a large server architecture handles items such as order processing, billing customers and data security; none of the Amazon users are aware that other customers are using the website at the same time – there is no interaction between users and server since they are kept entirely separate at all times.
Peer-to-peer model
▲ Figure 2.5 Peer-to-peer model
On a peer-to-peer network, each node joins the network to allow
» the provision of services to all other network users; the services available are listed on a nominated ‘look up’ computer – when a node requests a service, the ‘look up’ computer is contacted to find out which of the other network nodes can provide the required service
» other users on the network to simply access data from another node
» communication with other peers connected to the network
» peers to be both suppliers and consumers (unlike the client-server model where consumers and resources are kept entirely separate from each other)
» peers to participate as equals on the network (again this is different to the client-server model where a webserver and client have different responsibilities).
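The role of the ‘look up’ computer described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `LookupNode`, its methods, and the node and service names are all hypothetical and do not come from any real peer-to-peer system.

```python
class LookupNode:
    """Hypothetical 'look up' computer: records which nodes offer which service."""

    def __init__(self):
        self._providers = {}  # service name -> set of node names

    def register(self, node, service):
        # A node joining the network announces a service it can provide.
        self._providers.setdefault(service, set()).add(node)

    def find(self, service):
        # A requesting node asks which peers can provide the service.
        return sorted(self._providers.get(service, set()))


lookup = LookupNode()
lookup.register("node_A", "printing")
lookup.register("node_B", "file_sharing")
lookup.register("node_C", "printing")

print(lookup.find("printing"))   # nodes that can provide printing
print(lookup.find("scanning"))   # no node offers this service
```

Note that the look-up computer only answers the question ‘who can provide this service?’; the data itself still passes directly between peers.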
The peer-to-peer model does not have a central server. Each of the nodes (workstations) on the network can share its files with all the other nodes, and each of the nodes will have its own data. Because there is no central storage, there is no requirement to authenticate users.
This model is used in scenarios where no more than 10 nodes are required (such as a small business) where it is relatively easy for users to be in contact with each other on a regular basis. More than 10 nodes leads to performance and management issues.
Peer-to-peer offers little data security since there is no central security system. This means it is impossible to know who is authorised to share certain data. Users can create their own network node share point which is the only real security aspect since this gives them some kind of control. However, there are no real authentication procedures.
Examples of peer-to-peer network model
A user would choose the peer-to-peer network model for one or more of the following reasons:
» The network of users is fairly small.
» There is no need for robust security.
» They require workstation-based applications rather than server-based ones.
An example would be a small business where there is frequent user interaction and there is no need to have the features of a client-server network (for example, a builder with five associated workers located in their own homes who only need access to each other’s diaries, previous jobs, skills-base and so on – when the builder is commissioned to do a job they need to access each other’s computer to check on who is available and who has the appropriate skills).
Thin clients and thick clients
The client-server model offers thin clients and thick clients. These can often refer to both hardware and software.
Thin client
A thin client is heavily dependent on having access to a server to allow constant access to files and to allow applications to run uninterrupted. A thin client can either be a device or software which needs to be connected to a powerful computer or server to allow processing to take place (the computer or server could be on the internet or could be part of a LAN/MAN/WAN network). The thin client will not work unless it is connected at all times to the computer or server. A software example would be a web browser which has very limited functions unless it is connected to a server. Other examples include mobile phone apps which need constant access to a server to work. A hardware example is a POS terminal at a supermarket that needs constant access to a server to find prices, charge customers and to do any significant processing.
Thick client
A thick client can either be a device or software that can work offline or online; it is still able to do some processing whether it is connected to a server or not. A thick client can either be connected to a LAN/MAN/WAN, virtual network, the internet or a cloud computing server. A hardware example is a normal PC/laptop/tablet since it would have its own storage (HDD or SSD), RAM and operating system which means it is capable of operating effectively online or offline. An example of software is a computer game which can run independently on a user’s computer, but can also connect to an online server to allow gamers to play and communicate with each other.
Table 2.1 highlights some of the pros and cons of using thick client or thin client hardware.
Thick clients
Pros
» more robust (device can carry out processing even when not connected to server)
» clients have more control (they can store their own programs and data/files)
Cons
» less secure (relies on clients to keep their own data secure)
» each client needs to update data and software individually
» data integrity issues, since many clients access the same data, which can lead to inconsistencies
Thin clients
Pros
» less expensive to expand (low-powered and cheap devices can be used)
» all devices are linked to a server (data updates and new software installation done centrally)
» server can offer protection against hacking and malware
Cons
» high reliance on the server; if the server goes down or there is a break in the communication link then the devices cannot work
» despite cheaper hardware, the start-up costs are generally higher than for thick clients
▲ Table 2.1 Summary of pros and cons of thick and thin client hardware
Table 2.2 highlights the differences between thick and thin client software.

| Thin client software | Thick client software |
| --- | --- |
| always relies on a connection to a remote server or computer for it to work | can run some of the features of the software even when not connected to a server |
| requires very few local resources (such as SSD, RAM memory or computer processing time) | relies heavily on local resources |
| relies on a good, stable and fast network connection for it to work | more tolerant of a slow network connection |
| data is stored on a remote server or computer | can store data on local resources such as HDD or SSD |

▲ Table 2.2 Differences between thin and thick client software
ACTIVITY 2A
1 A company has 20 employees working on the development of a new type of battery for use in mobile phones. Decide which type of network model (client-server or peer-to-peer) would be most suitable. Give reasons for your choice.
2 Another company is made up of a group of financial consultants who advise other companies on financial matters, such as taxation and exporting overseas. Decide which type of network model (client-server or peer-to-peer) would be most suitable. Give reasons for your choice.
2.1.3 Network topologies
There are many ways to connect computers to make complex networks. Here we will consider
» bus networks
» star networks
» mesh networks
» hybrid networks.
Bus networks
A bus network topology uses a single central cable to which all computers and devices are connected. It is easy to expand and requires little cabling. Data can only travel in one direction; if data is being sent between devices then other devices cannot transmit. Terminators are needed at each end to prevent signal reflection (bounce). Bus networks are typically peer-to-peer.
The disadvantages of a bus network include:
» If the main cable fails, the whole network goes down.
» The performance of the network deteriorates under heavy loading.
» The network is not secure since each packet passes through every node.
The advantages of a bus network include:
» Even if one node fails, the remainder of the network continues to function.
» It is easy to increase the size of the network by adding additional nodes.
▲ Figure 2.6 Bus network topology
In bus network topology, each node looks at each packet and determines whether or not the address of the recipient in the packet matches the node address. If so, the node accepts the packet; if not, the packet is ignored. Bus networks are most suitable for situations with a small number of devices and light traffic, for example a small company or an office environment.
Star networks
A star network topology uses a central hub/switch and each computer/device is connected to the hub/switch. Data going from host to host is directed through the central hub/switch. Each computer/device has its own dedicated connection to the central node (hub/switch) – any type of network cable can be used for the connections (see Section 2.1.5 Wired and wireless networking). This type of network is typically client-server.
The disadvantages of a star network include:
» The initial installation costs are high.
» If the central hub/switch fails, then the whole network goes down.
The advantages of a star network include:
» Data collisions are greatly reduced due to the topology.
» It is a more secure network since security methods can be applied to the central node and packets only travel to nodes with the correct address.
» It is easy to improve by simply installing an upgraded hub.
» If one of the connections is broken it only affects one of the nodes.
▲ Figure 2.7 Star network topology
How packets are handled depends on whether the central node is a switch or a hub. If it is a hub, all the packets will be sent to every device/node on the star network – if the address in the packet matches that of the node, it will be accepted; otherwise, it is ignored (this is similar to the way packets are handled on a bus network). If the central node is a switch, packets will only be sent to nodes where the address matches the recipient address in the packet. The latter is clearly more secure, since only nodes intended to see the packet will receive it.
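The difference between the two kinds of central node can be illustrated with a short Python sketch. It is a simplified model only: the function names, the dictionary used as a packet and the single-letter node addresses are assumptions made for this example, not part of any real networking library.

```python
def hub_deliver(nodes, packet):
    """Hub: the packet is sent to every node; each node checks the address itself."""
    received_by = list(nodes)                                   # all nodes see the packet
    accepted_by = [n for n in nodes if n == packet["dest"]]     # others ignore it
    return received_by, accepted_by


def switch_deliver(nodes, packet):
    """Switch: the packet is sent only to the node whose address matches."""
    received_by = [n for n in nodes if n == packet["dest"]]
    return received_by, list(received_by)


nodes = ["A", "B", "C", "D"]
packet = {"src": "A", "dest": "C", "data": "hello"}

print(hub_deliver(nodes, packet))     # every node receives it, only C accepts it
print(switch_deliver(nodes, packet))  # only C receives and accepts it
```

In both cases only node C accepts the packet; the security difference lies in how many nodes get the chance to see it.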
Star networks are useful for evolving networks where devices are frequently added or removed. They are well suited to applications where there is heavy data traffic.
Mesh networks
There are two types of mesh network topologies: routing and flooding. Routing works by giving the nodes routing logic (in other words, they act like a router) so that data is directed to its destination by the shortest route and can be re-routed if one of the nodes in the route has failed. Flooding simply sends the data via all the nodes and uses no routing logic, which can lead to unnecessary loading on the network. A mesh network resembles a peer-to-peer network, but is fundamentally different.
The disadvantages of a mesh network include:
» A large amount of cabling is needed, which is expensive and time consuming.
» Set-up and maintenance is difficult and complex.
The advantages of a mesh network include:
» It is easy to identify where faults on the network have occurred.
» Any broken links in the network do not affect the other nodes.
» Good privacy and security, since packets travel along dedicated routes.
» The network is relatively easy to expand.
▲ Figure 2.8 Mesh network topology
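The two approaches described above, routing and flooding, can be compared with a small Python sketch. The six-node mesh below is invented for the example, and ‘shortest route’ is taken to mean the fewest hops, found with a breadth-first search; real mesh networks may use other measures and other algorithms.

```python
from collections import deque

# Hypothetical mesh: each node lists the nodes it is directly linked to.
MESH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}


def route(mesh, source, dest, failed=()):
    """Routing: return a fewest-hop route that avoids failed nodes, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dest:
            return path
        for neighbour in mesh[node]:
            if neighbour not in visited and neighbour not in failed:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


def flood(mesh, source):
    """Flooding: every node passes the data to all its neighbours."""
    reached = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in mesh[node]:
            if neighbour not in reached:
                reached.add(neighbour)
                queue.append(neighbour)
    return reached


print(route(MESH, "A", "F"))                  # normal route
print(route(MESH, "A", "F", failed={"D"}))    # re-routed around failed node D
print(sorted(flood(MESH, "A")))               # flooding reaches every node
```

Routing involves only the nodes on the chosen path, whereas flooding loads every node in the mesh even though only one of them is the destination.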
There are a number of applications worth considering here:
» The internet and WANs/MANs are typical uses of mesh networks.
» Other examples include industrial monitoring and control, where sensors are set up in a mesh design and feed back to a control system which is part of the mesh, for example
  – medical monitoring of patients in a hospital
  – electronics interconnectivity (for example, systems that link large screen televisions, DVDs, set top boxes, and so on); each device will be in a location forming the mesh
  – modern vehicles use wireless mesh network technology to enable the monitoring and control of many of the components in the vehicle.
EXTENSION ACTIVITY 2A
There appear to be similarities between the peer-to-peer network model and mesh network model. Describe the differences between the two models.
Hybrid networks
A hybrid network is a mixture of two or more different topologies (bus and star, bus and mesh, and so on). The main advantages and disadvantages depend on which types of network are used to make up the hybrid network, but an additional disadvantage is that they can be very complex to install, configure and maintain.
Additional advantages include:
» They can handle large volumes of traffic.
» It is easy to identify where a network fault has occurred.
» They are very well suited to the creation of larger networks.
▲ Figure 2.9 Hybrid bus and star network
Note that the handling of packets in hybrid networks will depend on which of the above topologies are used to make up the hybrid structure. One of the typical applications of hybrid networks is illustrated by the following example, involving three hotel chains, A, B and C. Suppose hotel chain A uses a bus network, hotel chain B uses a star network and hotel chain C uses a mesh network. At some point, all three hotel chains are taken over by another company. By using hybrid network technology, all three hotel chains can be connected together even though they are each using a different type of network. The system can also be expanded easily without affecting any of the existing hotels using the network. There are many other examples; you might want to explore the various applications for each type of network topology.
2.1.4 Public and private cloud computing
Cloud storage is a method of data storage where data is stored on offsite servers – the physical storage covers hundreds of servers in many locations.
The same data is stored on more than one server in case of maintenance or repair, allowing clients to access data at any time. This is known as data redundancy. The physical environment is owned and managed by a hosting company.
There are three common systems: public cloud, private cloud and hybrid cloud.
Public cloud is a storage environment where the customer/client and cloud storage provider are different companies.
Private cloud is storage provided by a dedicated environment behind a company firewall. Customer/client and cloud storage provider are integrated and operate as a single entity.
Hybrid cloud is a combination of private and public clouds. Some data resides in the private cloud and less sensitive/less commercial data can be accessed from a public cloud storage provider.
Instead of saving data on a local hard disk or other storage device, a user can save their data ‘in the cloud’. The pros and cons of using cloud storage are shown in Table 2.3.
Pros of using cloud storage
» customer/client files stored on the cloud can be accessed at any time from any device anywhere in the world provided internet access is available
» no need for a customer/client to carry an external storage device with them, or use the same computer to store and retrieve information
» provides the user with remote back-up of data to guard against data loss and to aid disaster recovery
» recovers data if a customer/client has a hard disk or back-up device failure
» offers almost unlimited storage capacity
Cons of using cloud storage
» if the customer/client has a slow or unstable internet connection, they would have problems accessing or downloading their data/files
» costs can be high if large storage capacity is required
» expensive to pay for high download/upload data transfer limits with the customer/client internet service provider (ISP)
» potential failure of the cloud storage company is possible – this poses a risk of loss of all back-up data
▲ Table 2.3 Summary of pros and cons of using cloud storage
Data security when using cloud storage
Companies that transfer vast amounts of confidential data from their own systems to a cloud service provider are effectively relinquishing control of their own data security. This raises a number of questions:
» What physical security exists regarding the building where the data is housed?
» How good is the cloud service provider’s resistance to natural disasters or power cuts?
» What safeguards exist regarding personnel who work for the cloud service company? Can they use their authorisation codes to access confidential data for monetary purposes?
Potential data loss when using cloud storage
There is a risk that important and irreplaceable data could be lost from the cloud storage facilities. Actions by hackers (gaining access to accounts or pharming attacks, for example) could lead to loss or corruption of data. Users need to be certain sufficient safeguards exist to overcome these risks. The following breaches of security involving some of the largest cloud service providers suggest why some people are nervous of using cloud storage for important files:
» The XEN security threat, which forced several cloud operators to reboot all their cloud servers, was caused by a problem in the XEN hypervisor (a hypervisor is a piece of computer software, firmware or hardware that creates and runs virtual machines).
» A large cloud service provider permanently lost data during a routine back-up procedure.
» The celebrity photos cloud hacking scandal, in which more than 100 private photos of celebrities were leaked. Hackers had gained access to a number of cloud accounts, which then enabled them to publish the photos on social networks and sell them to publishing companies.
» In 2016, the National Electoral Institute of Mexico suffered a cloud security breach in which 93 million voter registrations, stored on a central database, were compromised and became publicly available. To make matters worse, much of the information on this database was also linked to an Amazon cloud server outside Mexico.
Cloud software
Cloud storage is, of course, only one aspect of cloud computing. Other areas covered by cloud computing include databases, networking, software and analytical services using the internet. Here we will consider cloud software – you can research for yourself how databases and analytical services are provided by cloud computing services.
Software applications can be delivered to a user’s computer on demand using cloud computing services. The cloud provider will both host and manage software applications – this will include maintenance, software upgrades and security for a monthly fee. A user will simply connect to the internet (using their web browser on a computer or tablet or mobile phone) and contact their cloud services supplier. The cloud services supplier will connect them to the software application they require.
The main advantages are that the software will be fully tested and it does not need to reside on the user’s device. However, the user can still use the software even if the internet connection is lost. Data will simply be stored on the local device and then data will be uploaded or downloaded once the internet connection is restored. Cloud-based applications can, therefore, perform tasks on a local device. This makes them fundamentally different to web-based apps which need an internet connection at all times.
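The idea that a cloud-based application keeps working offline and synchronises later can be sketched as follows. This is a toy model under stated assumptions: the class `CloudApp`, its attributes and the list standing in for the provider’s servers are hypothetical, and real cloud software would also have to deal with conflicts and downloads.

```python
class CloudApp:
    """Hypothetical cloud-based application that still works when offline."""

    def __init__(self):
        self.online = True
        self.local_store = []   # data held on the local device while offline
        self.cloud_store = []   # stands in for the cloud provider's servers

    def save(self, item):
        # Online: data goes straight to the cloud. Offline: keep it locally.
        if self.online:
            self.cloud_store.append(item)
        else:
            self.local_store.append(item)

    def reconnect(self):
        # Connection restored: upload everything saved while offline.
        self.online = True
        self.cloud_store.extend(self.local_store)
        self.local_store.clear()


app = CloudApp()
app.save("report v1")
app.online = False          # internet connection is lost
app.save("report v2")       # user carries on working
app.reconnect()             # connection restored, data is uploaded
print(app.cloud_store)
```

A purely web-based app has no equivalent of `local_store`, which is why it cannot be used without a connection.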
2.1.5 Wired and wireless networking
Wireless
Wi-Fi and Bluetooth
Both Wi-Fi and Bluetooth offer wireless communication between devices. They both use electromagnetic radiation as the carrier of data transmission.
Bluetooth sends and receives radio waves in a band of 79 different frequencies (known as channels). These are all centred on a 2.45 GHz frequency. Devices using Bluetooth automatically detect and connect to each other, but they do not interfere with other devices since each communicating pair uses a different channel (from the 79 options). When a device wants to communicate, it picks one of the 79 channels at random. If the channel is already being used, it randomly picks another channel. This is known as spread spectrum frequency hopping. To further minimise the risks of interference with other devices, the communication pairs constantly change the frequencies (channels) they are using (several times a second). Bluetooth creates a secure wireless personal area network (WPAN) based on key encryption.
Bluetooth is useful when
» transferring data between two or more devices which are less than 30 metres apart
» the speed of data transmission is not critical
» using low bandwidth applications (for example, sending music files from a mobile phone to a headset).
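The channel selection described for Bluetooth (pick one of the 79 channels at random, and pick again if it is already in use) can be sketched in Python. The function names are hypothetical, and the shared seed used to keep a communicating pair in step is an assumption made purely for illustration; it is not a description of how real Bluetooth devices agree on their hop sequence.

```python
import random

CHANNELS = range(79)  # the 79 Bluetooth channels described in the text


def pick_channel(in_use, rng=random):
    """Pick a channel at random; if it is already being used, pick another."""
    if all(channel in in_use for channel in CHANNELS):
        raise RuntimeError("no free channel")
    while True:
        channel = rng.choice(CHANNELS)
        if channel not in in_use:
            return channel


def hop_sequence(shared_seed, hops):
    """Channels a pair hops through, assuming both devices share a seed."""
    rng = random.Random(shared_seed)
    return [rng.choice(CHANNELS) for _ in range(hops)]


busy = set(range(78))        # pretend channels 0 to 77 are already taken
print(pick_channel(busy))    # the only free channel is 78

# Both devices generate the same sequence, so they stay on the same channel.
print(hop_sequence(shared_seed=7, hops=5) == hop_sequence(shared_seed=7, hops=5))
```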
As mentioned earlier in the chapter, Wi-Fi also uses spread spectrum technology. However, Wi-Fi is best suited to operating full-scale networks, since it offers much faster data transfer rates, better range and better security than Bluetooth. A Wi-Fi-enabled device (such as a computer or smart phone) can access, for example, the internet wirelessly at any wireless access point (WAP) or ‘hot spot’ up to 100 metres away.
As mentioned, wireless connectivity uses electromagnetic radiation: radio waves, microwaves or infrared. The scale of frequency and wavelength of electromagnetic radiation is shown in Table 2.4.

| | radio waves | microwaves | infrared | visible light | ultra violet | X-rays | gamma rays |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Wavelength (m) | 10² | 10⁻¹ | 10⁻³ | 10⁻⁵ | 10⁻⁷ | 10⁻⁹ | 10⁻¹¹ |
| Frequency | 3 MHz | 3 GHz | 300 GHz | 30 THz | 3 PHz | 300 PHz | 30 EHz |

▲ Table 2.4 Frequency and wavelength of electromagnetic radiation
EXTENSION ACTIVITY 2B
Frequency and wavelength are linked by the equation:
$$f = \frac{c}{\lambda}$$
where f = frequency (Hz), λ = wavelength (m), and c = velocity of light (3 × 10⁸ m/s). Confirm the frequency values in Table 2.4 using the wavelengths given.
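One way to check the relationship between frequency and wavelength is to let a short program do the arithmetic. The sketch below assumes each wavelength is an exact power of ten (as in Table 2.4) and works with integer exponents to avoid rounding errors; the function and variable names are chosen for this example only.

```python
C_MANTISSA, C_EXPONENT = 3, 8   # velocity of light: 3 x 10^8 m/s

# Wavelengths from Table 2.4, stored as powers of ten (in metres).
WAVELENGTH_EXPONENTS = {
    "radio waves": 2,
    "microwaves": -1,
    "infrared": -3,
    "visible light": -5,
    "ultra violet": -7,
    "X-rays": -9,
    "gamma rays": -11,
}

PREFIXES = [(10**18, "EHz"), (10**15, "PHz"), (10**12, "THz"),
            (10**9, "GHz"), (10**6, "MHz"), (10**3, "kHz"), (1, "Hz")]


def frequency_hz(wavelength_exponent):
    """f = c / wavelength, for a wavelength of 10^exponent metres (exponent <= 8)."""
    return C_MANTISSA * 10 ** (C_EXPONENT - wavelength_exponent)


def with_prefix(hz):
    """Format a whole number of hertz using the largest suitable prefix."""
    for scale, unit in PREFIXES:
        if hz >= scale:
            return f"{hz // scale} {unit}"
    return f"{hz} Hz"


for name, exponent in WAVELENGTH_EXPONENTS.items():
    print(f"{name}: {with_prefix(frequency_hz(exponent))}")
```

For example, for radio waves the calculation is 3 × 10⁸ ÷ 10² = 3 × 10⁶ Hz, which is 3 MHz.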
Table 2.5 compares radio waves, microwaves and infrared. (Please note: the ‘>’ symbol in the table means ‘better than’.)

| Property | Comparison |
| --- | --- |
| Bandwidth | infrared > microwaves > radio waves (infrared has the largest bandwidth) |
| Penetration | radio waves > microwaves > infrared (radio waves have the best penetration) |
| Attenuation | radio waves > microwaves > infrared (radio waves suffer the least attenuation) |

▲ Table 2.5 Comparison of radio waves, microwaves and infrared
Penetration measures the ability of the electromagnetic radiation to pass through different media. Attenuation is the reduction in amplitude of a signal (infrared suffers the most attenuation because it can be affected by, for example, rain or internal walls). Thus, we would expect infrared to be suitable for indoor use only; the fact that it can be stopped by walls is seen as an advantage since this stops the signal causing interference elsewhere. Microwaves seem to offer the best compromise, since they support reasonable bandwidth, and have reasonable penetration and attenuation.
Additional notes on the use of satellites
The use of microwaves and radio waves was previously mentioned as a method for allowing Wi-Fi connectivity in networks. These methods are perfectly satisfactory for short distances – the electromagnetic waves carry the signals – but the curvature of the Earth prevents such methods transmitting data globally. The electromagnetic radiation from antenna A is transmitted but is unable to reach antenna B due to the Earth’s curvature.
▲ Figure 2.10
To overcome this problem, we need to adopt satellite technology: the signal is beamed from antenna A to a satellite orbiting Earth; the signal is boosted by the satellite and is then beamed back to Earth and picked up by antenna B.
▲ Figure 2.11
The communication between antennae and satellite is carried out using radio wave or microwave frequencies. Different frequency bands are used to prevent signal interference and to allow networks spread across the Earth to communicate through use of satellites (many satellites orbit the Earth – refer to Section 2.2.2 for more information on use of satellite technology with networks).
Wired
There are three main types of cable used in wired networks (see Figure 2.12).
▲ Figure 2.12 (left to right) Twisted pair cable, coaxial cable, fibre optic cable
Twisted pair cables
Twisted pair cables are the most common cable type used in LANs and are the cheapest option. However, of the three types of cable, they have the lowest data transfer rate and suffer the most from external interference (such as electromagnetic radiation). There are two types of twisted pair cable: unshielded and shielded. Unshielded is used by residential users. Shielded is used commercially (the cable contains a thin metal foil jacket which cancels out some of the external interference).
Coaxial cables
Coaxial cables are the most commonly used cables in MANs and by cable television companies. The cost of coaxial cables is higher than twisted pair cables but they offer a better data transfer rate and are affected less by external interference. Coaxial cables also have about 80 times the transmission capacity of twisted pair. Coaxial suffers from the greatest signal attenuation, but offers the best anti-jamming capabilities.
Fibre optic cables
Fibre optic cables are most commonly used to send data over long distances, because they offer the best data transfer rate, the smallest signal attenuation and have a very high resistance to external interference. The main drawback is the high cost. Unlike the other two types of cable, fibre optics use pulses of light rather than pulses of electricity to transmit data. They have about 26 000 times the transmission capacity of twisted pair cables.
Fibre optic cables can be single- or multi-mode. Single-mode uses a single-mode light source and has a smaller central core, which results in less light reflection along the cable. This allows the data to travel faster and further, making it a good choice for CATV and telecommunications. Multi-mode uses a multi-mode light source; the construction causes higher light reflections in the core, so these cables work best over shorter distances (in a LAN, for example).
Wired versus wireless
Numerous factors should be considered when deciding if a network should use wired or wireless connectivity, as listed below.
Wired networking
» More reliable and stable network (wireless connectivity is often subjected to interference).
» Data transfer rates tend to be faster with no ‘dead spots’.
» Tends to be cheaper overall, in spite of the need to buy and install cable.
» Devices are not mobile; they must be close enough to allow for cable connections.
» Lots of wires can lead to tripping hazards, overheating of connections (potential fire risk) and disconnection of cables during routine office cleaning.
Wireless networking
» It is easier to expand networks and is not necessary to connect devices using cables.
» Devices have increased mobility, provided they are within range of the WAPs.
» Increased chance of interference from external sources.
» Data is less secure than with wired systems; it is easier to intercept radio waves and microwaves than cables so it is essential to protect data transmissions using encryption (such as WEP, WPA2).
» Data transmission rate is slower than wired networks (although it is improving).
» Signals can be stopped by thick walls (in old houses, for example) and signal strength can vary, or ‘drop out’.
Other considerations
» If mobile phones and tablets are connected to the network, it will need to offer Wi-Fi or Bluetooth capability.
» There may be regulations in some countries regarding which wireless transmission frequencies can be used legally.
» Permission from authorities and land owners may be required before laying cables underground.
» There are numerous competing signals in the air around us; it is important to consider this when deciding whether to go for wired or wireless connectivity.
2.1.6 Hardware requirements of networks
In this section we will consider a number of hardware items needed to form a LAN and the hardware needed to form a WAN. Please note
» the concept of the WLAN and the hardware needed to support it have been covered in earlier sections
» the hardware items hub and gateway have been included in this section to complete the picture; however, knowledge of these two items is not required by the syllabus.
Hub
Hubs are hardware devices that can have a number of devices or computers connected to them.
▲ Figure 2.13 Hub flow diagram
They are often used to connect a number of devices to form a local area network (LAN), for example a star network (see Section 2.1.3). A hub’s main task is to take any data packet (a group of data being transmitted) received at one of its ports and then send the data to every computer in the network. Using hubs is not a very secure method of data distribution and is also wasteful of bandwidth. Note that hubs can be wired or wireless devices.
Switch
Switches are similar to hubs, but are more efficient in the way they distribute the data packet. As with hubs, they connect a number of devices or computers together to form a LAN (for example, a star network). However, unlike a hub, the switch checks the data packet received and works out its destination address (or addresses) and sends the data to the appropriate computer(s) only. This makes using a switch a more secure and efficient way of distributing data.
▲ Figure 2.14 Switch flow diagram
Each device or computer on a network has a media access control (MAC) address which identifies it uniquely. Data packets sent to switches will have a MAC address identifying the source of the data and additional addresses identifying each device which should receive the data. Note that switches can be wired or wireless devices.
Repeater
When signals are sent over long distances, they suffer attenuation or signal loss. Repeaters are devices which are added to transmission systems to boost the signal so it can travel greater distances. They amplify signals on both analogue (copper cable) and digital (fibre optic cable) communication links.
Repeaters can also be used on wireless systems. These are used to boost signals to prevent any ‘dead spots’ in the Wi-Fi zone. These devices plug into electric wall sockets and send out booster signals. They are termed non-logical devices because they will boost all signals which have been detected; they are not selective.
Sometimes, hubs contain repeaters and are known as repeating hubs. All signals fed to the hub are boosted before being sent to all devices in the network, thus increasing the operational range. There are two main drawbacks of repeating hubs:
1 They have only one collision domain. When the signals are boosted and then broadcast to devices, any collisions which might occur are not resolved there and then. One way to deal with this problem is to make use of jamming signals – while this manages the collisions, it also reduces network performance since it involves repeated broadcasts as the collisions are resolved.
2 The devices are referred to as unmanaged since they are unable to manage delivery paths and also security in the network.
Bridge Bridges are devices that connect one LAN to another LAN that uses the same protocol (communication rules). They are often used to connect together different parts of a LAN so that they can function as a single LAN.
▲ Figure 2.15 Bridge flow diagram
Bridges are used to interconnect LANs (or parts of LANs), since sending out every data packet to all possible destinations would quickly flood larger networks with unnecessary traffic. For this reason, a router is used to communicate with other networks, such as the internet. Note that bridges can be wired or wireless devices.
Router
Routers enable data packets to be routed between different networks, for example, to join a LAN to a WAN. The router takes data transmitted in one format from a network (which is using a particular protocol) and converts the data to a protocol and format understood by another network, thereby allowing them to communicate via the router. We can, therefore, summarise the role of routers as follows. Routers
» restrict broadcasts to a LAN
» act as a default gateway
» can perform protocol translation; for example, allowing a wired network to communicate with a wireless (Wi-Fi) network – the router can take an Ethernet data packet, remove the Ethernet part and put the IP address into a frame recognised by the wireless protocol (in other words, it is performing a protocol conversion)
» can move data between networks
» can calculate the best route to a network destination address.
▲ Figure 2.16 Router flow diagram
Broadband routers sit behind a firewall. The firewall protects the computers on a network. The router’s main function is to transmit internet and transmission protocols between two networks and allow private networks to be connected. The router inspects the data packet sent to it from any computer on any of the networks connected to it. Since every computer on the same network has the same part of an internet protocol (IP) address, the router is able to send the data packet to the appropriate switch and it will then be delivered using the MAC destination address (see next section). If the MAC address doesn’t match any device on the network, the packet is passed on to another switch on the same network until the appropriate device is found. Routers can be wired or wireless devices.
Gateway A gateway is a network point (or node) that acts as an entrance to another network. It is a key point for data on its way to or from other networks. It can be used to connect two or more dissimilar LANs (LANs using different protocols). The gateway converts data packets from one protocol to another. Gateways can also act as routers, firewalls or servers – in other words, any device that allows traffic to flow in and out of the networks. Gateways can be wired or wireless devices. All networks have boundaries so that all communication within the network is conducted using devices such as switches or routers. If a network node needs to communicate outside its network, it needs to use a gateway.
Modems Modern computers work with digital data, whereas many of the public communication channels still only allow analogue data transmission. To allow the transmission of digital data over analogue communication channels we need to use a modem (modulator demodulator). This device converts digital data to analogue data. It also does the reverse and converts data received over the analogue network into digital data which can be understood by the computer. Wireless modems transmit data in a modulated form to allow several simultaneous wireless communications to take place without interfering with each other. A modem will connect to the public infrastructure (cable, telephone, fibre-optics or satellite) and will supply the user with a standard Ethernet output which allows connection to a router, thus enabling an internet connection to occur.
▲ Figure 2.17 Wireless modem flow diagram
Another example of a modem is a softmodem (software modem), which uses minimal hardware and uses software that runs on the host computer. The computer’s resources (mainly the processor and RAM) replace the hardware of a conventional modem.
While the router will allow the creation of a network in a home, for example, the modem allows for the connection to the external networks (for example, the internet). Routers and modems can be combined into one unit; these devices have the electronics and software to provide both router and modem functions.
Table 2.6 shows the differences between routers and gateways.

| Routers | Gateways |
|---|---|
| forward packets of data from one network to another; routers read each incoming packet of data and decide where to forward the packet | convert one protocol (or data format) to another protocol (format) used in a different network |
| can route traffic from one network to another network | convert data packets from one protocol to another; they act as an entry and exit point to networks |
| can be used to join LANs together to form a WAN (sometimes called brouters) and also to connect a number of LANs to the internet | translate from one protocol to another |
| offer additional features such as dynamic routing (ability to forward data by different routes) | do not support dynamic routing |

▲ Table 2.6 Differences between routers and gateways
EXTENSION ACTIVITY 2C Draw a diagram to show how a gateway could be used to connect together three LANs which are using different protocols. Include all the hardware devices and cables needed.
Network interface card (NIC)
A network interface card (NIC) is needed to allow a device to connect to a network (such as the internet). It is usually part of the device hardware and frequently contains the MAC address generated at the manufacturing stage.
Wireless network interface card/controller (WNIC)
Wireless network interface cards/controllers (WNICs) are the same as the more ordinary NICs, in that they are used to connect devices to the internet or other networks. They use an antenna to communicate with networks via microwaves and normally either plug into a USB port or are fitted internally as a plug-in integrated circuit.
▲ Figure 2.18 Wireless network interface card/controller (WNIC)
As with usual NICs, they work on layers 1 and 2 of the OSI model (refer to Chapter 14 for more details). WNICs work in two modes. Infrastructure mode requires WAPs (wireless access points) and all the data is transferred using the WAP and hub/switch; all the wireless devices connect to the WAP and must use the same security and authentication techniques. Ad hoc mode does not need to have access to WAPs; it is possible for devices to interface with each other directly.
2.1.7 Ethernet
Ethernet is a protocol used by many wired LANs. It was adopted as a standard by the Institute of Electrical and Electronics Engineers (IEEE) and Ethernet is also known as IEEE 802.3. A network using Ethernet is made up of:
» a node (any device on the LAN)
» medium (path used by the LAN devices, such as an Ethernet cable)
» frame (data is transmitted in frames which are made up of source address and destination address – the addresses are often the MAC address).
Conflicts When using Ethernet, it is possible for IP addresses to conflict; this could show up as a warning such as that in Figure 2.19.
▲ Figure 2.19 IP address conflict error
This may occur if devices on the same network have been given the same IP address; without a unique IP address it is not possible to connect to a network. This is most likely to occur on a LAN where dynamic IP addresses may have been used. Dynamic IP addresses are temporary and may have been assigned to a device on the network. Unfortunately, another device using a static IP address may already have the same IP address. This can be resolved by re-starting the router. Any dynamic IP addresses will be re-assigned, which could resolve the issue.
Collisions
Ethernet supports broadcast transmission (communications where pieces of data are sent from sender to receiver), which is used to send messages to all devices connected to a LAN. The risk is that two messages using the same data channel could be sent at the same time, leading to a collision. Carrier sense multiple access with collision detection (CSMA/CD) was developed to try and resolve this issue. Collision detection depends on simple physics: when a frame is sent it causes a voltage change on the Ethernet cable. When a collision is detected, a node stops transmitting a frame and transmits a ‘jam’ signal and then waits for a random time interval before trying to resend the frame. CSMA/CD protocol will define the random time period for a device to wait before trying again.
Figure 2.20 shows how data collisions can be dealt with using transmission counters (which keep track of how many times the collision detection routine has been entered – there will be a defined limit as part of the CSMA/CD protocol) and random time periods.
[Flow diagram, starting at connector A:
1 assemble frame
2 is line idle? If no, wait for allocated time and check again
3 set transmission counter = 1 and start to send frame
4 collision detected? If no, continue to send until the frame is sent; then, if there is another frame, return to A, otherwise END
5 if yes, stop transmission and send jam signal, then increment transmission counter
6 max transmission counter reached? If yes, abort transmission; if no, wait for allocated time period then re-start transmission]
▲ Figure 2.20 How data collisions can be dealt with using transmission counters
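The retry logic of Figure 2.20 can be sketched in Python. This is a minimal simulation, not a real network driver: the function name `send_frame`, the `collision_on_attempt` callback and the default limit of 16 attempts are assumptions made for illustration. The waiting time is chosen here by binary exponential backoff, which is one common way of picking the random interval; the text above only states that the interval is random.

```python
import random


def send_frame(collision_on_attempt, max_attempts=16, rng=None):
    """Simulate sending one frame using a transmission counter.

    collision_on_attempt(n) is a hypothetical callback returning True if the
    n-th attempt collides. Returns (sent, attempts, waits), where waits lists
    the random backoff (in time slots) chosen after each collision.
    """
    rng = rng or random.Random()
    waits = []
    counter = 1  # set transmission counter = 1
    while True:
        if not collision_on_attempt(counter):
            return True, counter, waits  # frame sent without a collision
        # Collision: stop transmitting and send a jam signal (not modelled).
        if counter >= max_attempts:
            return False, counter, waits  # abort transmission
        # Assumed backoff rule: wait a random number of slots in the range
        # 0 .. 2**min(counter, 10) - 1 before re-starting transmission.
        waits.append(rng.randrange(2 ** min(counter, 10)))
        counter += 1  # increment transmission counter


# Example: the first two attempts collide and the third succeeds.
sent, attempts, waits = send_frame(lambda n: n < 3)
print(sent, attempts, len(waits))
```

Because the counter has an upper limit, the retry loop always terminates, either with the frame sent or with the transmission aborted.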
EXTENSION ACTIVITY 2D
Review Figure 2.20. As it stands, it is possible for an endless loop to be established. Suggest a modification to the flow diagram to ensure it terminates if there is a problem with the data channel, or to prevent the data transmission holding up the computer for an unacceptable time period.
2.1.8 Bit streaming Bit streaming is a contiguous sequence of digital bits sent over the internet or a network that requires a high speed data communication link (such as fast broadband). Since bit streaming often involves very large files (such as video) it is necessary for the files to undergo some data compression before transmission. It is also necessary to have some form of buffering to ensure smooth playback of the media files. The data transmission rate from the file server (containing the video, for example) to the buffer must be greater than the rate at which data is transmitted from buffer to media player. The larger the buffer, the better the control over the bit rate being sent to the media player. The media player will always check to ensure data lies between a minimum value (often referred to as low water mark) and a maximum value (often referred to as a high water mark). The difference between the two values is usually about 80% of the total buffer capacity. The buffer is a temporary storage area of the computer.
[Diagram: the source of the data stream (from server) feeds the buffer by bit streaming; the buffer, marked with low and high water marks, feeds the media player.]
▲ Figure 2.21 Bit streaming
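The behaviour of the buffer can be illustrated with a small simulation. This is a sketch under stated assumptions rather than how any particular media player works: the function name `simulate_buffer`, the water marks at 10% and 90% of capacity (a difference of 80%), and the rule that downloading pauses at the high water mark and resumes at the low water mark are all chosen for illustration.

```python
def simulate_buffer(capacity, fill_rate, play_rate, ticks,
                    low_percent=10, high_percent=90):
    """Return the buffer level after each time step.

    fill_rate is the data arriving from the server per step and play_rate
    is the data taken by the media player per step (same units as capacity).
    """
    low_mark = capacity * low_percent // 100
    high_mark = capacity * high_percent // 100
    level = 0
    downloading = True
    levels = []
    for _ in range(ticks):
        if downloading:
            level = min(capacity, level + fill_rate)
        if level >= play_rate:
            level -= play_rate  # the media player consumes data
        # Assumed control rule based on the two water marks.
        if level >= high_mark:
            downloading = False
        elif level <= low_mark:
            downloading = True
        levels.append(level)
    return levels


levels = simulate_buffer(capacity=100, fill_rate=8, play_rate=5, ticks=200)
print(min(levels), max(levels))
```

With a fill rate greater than the play rate, the level climbs to the high water mark and then cycles between the two marks without ever emptying.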
Table 2.7 shows the pros and cons of bit streaming.

| Pros of bit streaming | Cons of bit streaming |
|---|---|
| no need to wait for a whole video or music file to be downloaded before the user can watch or listen | cannot stream video or music files if broadband connection is lost |
| no need to store large files on your device | video or music files will pause to allow the data being streamed to ‘catch up’ if there is insufficient buffer capacity or slow broadband connection |
| allows video files and music files to be played on demand (as required) | streaming uses up a lot of bandwidth |
| no need for any specialist hardware | security risks associated with downloading files from the internet |
| affords piracy protection (more difficult to copy streamed files than files stored on a hard drive) | copyright issues |

▲ Table 2.7 Pros and cons of bit streaming
Bit streaming can be either on demand or real time.
Real time » An event is captured by camera and microphone and is sent to a computer. » The video signal is converted (encoded) to a streaming media file. » The encoded file is uploaded from the computer to the dedicated video streaming server. » The server sends the encoded live video to the user’s device. » Since the video footage is live it is not possible to pause, rewind or fast forward.
On demand » Digital files stored on a server are converted to a bit streaming format (encoding takes place and the encoded files are uploaded to a server). » A link to the encoded video/music file is placed on the web server to be downloaded. » The user clicks on the link and the video/music file is downloaded in a contiguous bit stream. » Because it is on demand, the streamed video/music is broadcast to the user as and when required. » It is possible to pause, rewind and fast forward the video/music if required.
ACTIVITY 2B 1 a) Explain the differences between LAN, MAN and WAN. b) Give three of the benefits of networking computers. c) Explain the following terms. i) Thick client ii) Thin client 2 a) Draw diagrams to show the following network topologies. i) Bus ii) Star iii) Mesh b) Give one benefit and one drawback of using each type of network topology. 3 a) Explain the differences between public and private cloud computing. b) Give two benefits of using cloud computing. c) Give two drawbacks of using cloud computing. 4 You have been asked by a manager to write a report on whether a LAN being set up in their new building should use wired or wireless connectivity. The building has 20 floors. Explain your arguments for and against using both types of connectivity and draw a conclusion to help the manager make their decision. 5 a) What is meant by bit streaming? b) Why is it necessary to use buffers whilst streaming a video from the internet? c) Explain the differences between on demand and real time bit streaming.
2.2 The internet Key terms
Internet – massive network of networks, made up of computers and other electronic devices; uses TCP/IP communication protocols.
World Wide Web (WWW) – collection of multimedia web pages stored on a website, which uses the internet to access information from servers and other computers.
HyperText Mark-up Language (HTML) – mark-up language used to design web pages, which are transferred using http(s) protocols.
Uniform resource locator (URL) – specifies location of a web page (for example, www.hoddereducation.co.uk).
Web browser – software that connects to DNS to locate IP addresses; interprets web pages sent to a user’s computer so that documents and multimedia can be read or watched/listened to.
Internet service provider (ISP) – company which allows a user to connect to the internet. They will usually charge a monthly fee for the service they provide.
Public switched telephone network (PSTN) – network used by traditional telephones when making calls or when sending faxes.
Voice over Internet Protocol (VoIP) – converts voice and webcam images into digital packages to be sent over the internet.
Internet protocol (IP) – uses IPv4 or IPv6 to give addresses to devices connected to the internet.
IPv4 – IP address format which uses 32 bits, such as 200.21.100.6.
Classless inter-domain routing (CIDR) – increases IPv4 flexibility by adding a suffix to the IP address, such as 200.21.100.6/18.
IPv6 – newer IP address format which uses 128 bits, such as A8F0:7FFF:F0F1:F000:3DD0:256A:22FF:AA00.
Zero compression – way of reducing the length of an IPv6 address by replacing groups of zeroes by a double colon (::); this can only be applied once to an address to avoid ambiguity.
Sub-netting – practice of dividing networks into two or more sub-networks.
Private IP address – an IP address reserved for internal network use behind a router.
Public IP address – an IP address allocated by the user’s ISP to identify the location of their device on the internet.
Domain name service (DNS) – (also known as domain name system) gives domain names for internet hosts and is a system for finding IP addresses of a domain name.
JavaScript® – object-orientated (or scripting) programming language used mainly on the web to enhance HTML pages.
PHP – hypertext processor; an HTML-embedded scripting language used to write web pages.
2.2.1 The differences between the internet and the World Wide Web There are fundamental differences between the internet and the World Wide Web (WWW).
Internet
» The internet is a massive network of networks (although, as explained in Section 2.1.1, the internet is not a WAN) which are made up of various computers and other electronic devices.
» It stands for interconnected network.
» The internet makes use of transmission control protocol (TCP)/internet protocol (IP).
World Wide Web (WWW)
» This is a collection of multimedia web pages and other documents which are stored on websites.
» Web pages are written using HyperText Mark-up Language (HTML) and are transferred using http(s) protocols.
» Uniform resource locators (URLs) specify the location of all web pages.
» Web resources are accessed by web browsers.
» The world wide web uses the internet to access information from servers and other computers.
2.2.2 Hardware and software needed to support the internet
The fundamental requirements for connecting to the internet are
» a device (such as a computer, tablet or mobile phone)
» a telephone line connection or a mobile phone network connection (however, it is possible that a tablet or mobile phone may connect to the internet using a wireless router)
» a router (which can be wired or wireless) or router and modem
» an internet service provider (ISP) (combination of hardware and software)
» a web browser.
The telephone network system, public switched telephone network (PSTN), is used to connect computers/devices and LANs between towns and cities. Satellite technology is used to connect to other countries (see later). In recent years, telephone lines have changed from copper cables to fibre optic cables, which permits greater bandwidth and faster data transfer rates (and less risk of data corruption from interference). Fibre optic telephone networks are usually identified as ‘fast broadband’. As discussed earlier, high speed broadband has allowed WLANs to be developed by using WAPs. High speed communication links allow telephone and video calls to be made using a computer and the internet. Telephone calls require either an internet-enabled telephone connected to a computer (using a USB port) or external/internal microphone and speakers. Video calls also require a webcam. When using the internet to make a phone call, the user’s voice is converted to digital packages using Voice over Internet Protocol (VoIP). Data is split into packages (packet switching) and sent over the network via the fastest route. Packet switching and circuit switching are covered in more detail in Chapter 14.
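The idea of splitting data into numbered packets and putting them back together at the receiving end can be sketched as follows. The field names, the packet size and the two addresses are hypothetical values chosen for the example; real packets carry more header information than this.

```python
import random


def split_into_packets(data, sender, receiver, size):
    """Split data into packets that each carry addresses and an order number."""
    return [
        {"sender": sender, "receiver": receiver, "order": number,
         "payload": data[start:start + size]}
        for number, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets):
    """Sort the packets by order number and join the payloads."""
    ordered = sorted(packets, key=lambda packet: packet["order"])
    return b"".join(packet["payload"] for packet in ordered)


message = b"Hello, this is a voice sample."
packets = split_into_packets(message, "192.0.2.1", "198.51.100.7", size=8)
random.shuffle(packets)  # packets may arrive out of order after taking different routes
print(len(packets), reassemble(packets) == message)
```

The order number is what allows the receiver to rebuild the original data even when packets arrive out of sequence.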
Comparison between PSTN and internet when making a phone call
Public switched telephone network (PSTN)
PSTN uses a standard telephone connected to a telephone line. The telephone line connection is always open whether or not anybody is talking; the link is not terminated until the receivers are replaced by both parties. Telephone lines remain active even during a power cut; they have their own power source. Modern phones are digitised systems and use fibre optic cables (although because of the way it works this is a big waste of capacity – a 10 minute phone call will transmit about 10 MB of data). Existing phone lines use circuit switching (when a phone call is made the connection (circuit) is maintained throughout the duration of the call – this is the basis of PSTN).
Phone calls using the internet Phone calls using the internet use either an internet phone or microphone and speakers (video calls also require a webcam). The internet connection is only ‘live’ while data (sound/video image) is being transmitted.
Voice over Internet Protocol (VoIP) converts sound to digital packages (encoding) which can be sent over the internet. VoIP uses packet switching; the networks simply send and retrieve data as it is needed so there is no dedicated line, unlike PSTN. Data is routed through thousands of possible pathways, allowing the fastest route to be determined. The conversation (data) is split into data packages. Each packet contains at least the sender’s address, receiver’s address and order number of packet; the sending computer sends the data to its router which sends the packets to another router, and so on. At the receiving end, the packets are reassembled into the original state (see Chapter 14 for more details). VoIP also carries out file compression to reduce the amount of data being transmitted.
Because the link only exists while data is being transmitted, a typical 10 minute phone call may only contain about 3 minutes where people are talking; thus only 3 MB of data is transmitted, making it much more efficient than PSTN.
Cellular networks and satellites
Other devices, such as mobile phones, use the cellular network. Here, the mobile phone providers act as the ISPs and the phones contain communication software which allows them to access the telephone network and also permits them to make an internet connection.
Satellites are an important part of all network communications that cover vast distances. Due to the curvature of the Earth, the height of the satellite’s orbit determines how much coverage it can give. Figure 2.22 shows how satellites are classified according to how high they orbit in relation to the Earth’s surface.
[Diagram, not to scale, showing three orbit heights above the Earth’s surface: GEO at 35 800 km, MEO at 5000–12 000 km and LEO at 500–2500 km.]
» Geostationary Earth Orbit (GEO) satellites provide long distance telephone and computer network communications; orbital period = 24 hours
» Medium Earth Orbit (MEO) satellites are used for GPS systems (about 10 MEO satellites are currently orbiting the Earth); orbital period = 2 to 12 hours
» Low Earth Orbit (LEO) satellites are used by the mobile phone networks (there are currently more than 100 LEO satellites orbiting the Earth); orbital period = 80 mins to 2 hours
▲ Figure 2.22 Satellite classification
Satellites have the advantage that they will always give complete coverage and don’t suffer from signal attenuation to the same extent as underground/undersea cables. It is also difficult to isolate and resolve faults in cables on the sea bed.
2.2.3 IP addresses
The internet is based on TCP/IP protocols. Protocols define the rules that must be agreed by senders and receivers on the internet. Protocols can be divided into TCP layers (see Chapter 14). We will first consider internet protocols (IP).
Internet protocols (IP)
IPv4 addressing
The most common type of addressing on the internet is IP version 4 (IPv4). This is based on 32 bits giving 2^32 (4 294 967 296) possible addresses. The 32 bits are split into four groups of 8 bits (thus giving a range of 0 to 255). For example, 254.0.128.77.
The system uses the group of bits to define network (netID) and network host (hostID). The netID allows for initial transmission to be routed according to the netID and then the hostID is looked at by the receiving network. Networks are split into five different classes, as shown in Table 2.8 below.

| Network class | IPv4 range | Number of netID bits | Number of hostID bits | Types of network |
|---|---|---|---|---|
| A | 0.0.0.0 to 127.255.255.255 | 8 | 24 | very large |
| B | 128.0.0.0 to 191.255.255.255 | 16 | 16 | medium size |
| C | 192.0.0.0 to 223.255.255.255 | 24 | 8 | small networks |
| D | 224.0.0.0 to 239.255.255.255 | – | – | multi-cast |
| E | 240.0.0.0 to 255.255.255.255 | – | – | experimental |

▲ Table 2.8 The five network classes
Consider the class C network IP address 192.15.25.240, which would be written in binary as:
11000000 00001111 00011001 11110000
Here the network ID is 192.15.25 and the host ID is 240.
Consider the class B network IP address 128.148.12.14, which would be written in binary as:
10000000 10010100 00001100 00001110
Here the network ID is 128.148 and the host ID is 12.14 (made up of sub-net ID 12 and host ID of 14).
Consider the class A network IP address 29.68.0.43, which would be written in binary as:
00011101 01000100 00000000 00101011
Here the network ID is 29 and the host ID is 68.0.43 (made up of sub-net ID 68.0 and host ID of 43).
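The class rules in Table 2.8 can be turned into a short Python sketch that finds the class from the first group of 8 bits and then splits the address into netID and hostID. The function names `classify_ipv4` and `to_binary` are illustrative only.

```python
def classify_ipv4(address):
    """Return (network class, netID, hostID) for a dotted IPv4 address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    first = octets[0]
    if first <= 127:
        net_class, net_octets = "A", 1
    elif first <= 191:
        net_class, net_octets = "B", 2
    elif first <= 223:
        net_class, net_octets = "C", 3
    elif first <= 239:
        return "D", None, None  # multi-cast: no netID/hostID split
    else:
        return "E", None, None  # experimental
    net_id = ".".join(str(octet) for octet in octets[:net_octets])
    host_id = ".".join(str(octet) for octet in octets[net_octets:])
    return net_class, net_id, host_id


def to_binary(address):
    """Write each of the four groups as 8 binary digits."""
    return " ".join(f"{int(part):08b}" for part in address.split("."))


print(classify_ipv4("128.148.12.14"), to_binary("128.148.12.14"))
print(classify_ipv4("29.68.0.43"), to_binary("29.68.0.43"))
```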
However, it soon became clear that this IPv4 system provides insufficient address range. For example, a user with a medium sized network (class B) might have 284 host machines, while their class B licence allows them 2^16 − 2 = 65 534 host addresses (the value is not 65 536 since two values are not assigned). This means most of the allocated host IDs will not be used, which is wasteful. Classless inter-domain routing (CIDR) reduces this problem by increasing the flexibility of the IPv4 system. A suffix is used, such as 192.30.250.0/18, which means 18 bits will be used for the net ID and the last 14 bits will be used for the host ID (rather than the normal 24 bits and 8 bits for a class C network). The suffix clearly increases the flexibility regarding which bits represent the net ID and which represent the host ID.
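Python's standard `ipaddress` module can be used to explore a CIDR suffix. The sketch below takes the /18 example from the text; `strict=False` is needed because the address given has some host ID bits set, so the module works out the underlying network address itself.

```python
import ipaddress

# strict=False lets us pass an address whose host ID bits are not all zero.
network = ipaddress.ip_network("192.30.250.0/18", strict=False)

net_id_bits = network.prefixlen        # bits used for the net ID
host_id_bits = 32 - network.prefixlen  # bits left for the host ID

print(network)                         # the network the address belongs to
print(net_id_bits, host_id_bits)
print(network.num_addresses)           # 2 to the power of host_id_bits
```

An 18-bit net ID leaves 14 bits for the host ID, which gives 2^14 = 16 384 addresses in the block.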
EXTENSION ACTIVITY 2E Network address translation (NAT) removes the need for each IP address to be unique. Find out how it works.
IPv6 addressing
IPv6 addressing has been developed to overcome some of the problems associated with IPv4. This system uses 128-bit addressing, which allows for much more complex addressing structures. An IPv6 address is broken into 16-bit chunks and because of this, it adopts the hexadecimal notation. For example:
A8FB:7A88:FFF0:0FFF:3D21:2085:66FB:F0FA
Note how a colon (:) rather than a decimal point (.) is used here. It has been designed to allow the internet to grow in terms of number of hosts and the potential amount of data traffic. IPv6 has benefits over IPv4; it
» has no need for NATs (network address translation)
» removes risk of private IP address collisions
» has built in authentication
» allows for more efficient routing.
Zero compression
IPv6 addresses can be quite long, but there is a way to shorten them using zero compression. For example, 900B:3E4A:AE41:0000:0000:AFF7:DD44:F1FF can be written as:
900B:3E4A:AE41::AFF7:DD44:F1FF
with the section 0000:0000 replaced by ::
The zero compression can only be applied ONCE to an IPv6 address, otherwise it would be impossible to tell how many zeros were replaced on each occasion where it was applied. For example, 8055:F2F2:0000:0000:FFF1:0000:0000:DD04 can be rewritten either as:
8055:F2F2::FFF1:0000:0000:DD04
or as:
8055:F2F2:0000:0000:FFF1::DD04
8055:F2F2::FFF1::DD04 is not a legal way of compressing the original address – we have no way of knowing whether the original address was
8055:F2F2:0000:FFF1:0000:0000:0000:DD04
or
8055:F2F2:0000:0000:0000:FFF1:0000:DD04
or
8055:F2F2:0000:0000:FFF1:0000:0000:DD04
It would, therefore, be regarded as ambiguous.
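Zero compression can be tried out with Python's standard `ipaddress` module. Note that Python prints the hexadecimal digits in lower case; the helper name `is_valid_ipv6` is just an illustrative choice.

```python
import ipaddress

full = ipaddress.IPv6Address("900B:3E4A:AE41:0000:0000:AFF7:DD44:F1FF")
print(full.compressed)  # shortened form using ::
print(full.exploded)    # full form with every group written out


def is_valid_ipv6(text):
    """Return True if text is a legal IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ValueError:
        return False


# Both legal compressions describe the same address ...
first = ipaddress.IPv6Address("8055:F2F2::FFF1:0000:0000:DD04")
second = ipaddress.IPv6Address("8055:F2F2:0000:0000:FFF1::DD04")
print(first == second)

# ... but using :: twice is rejected because it is ambiguous.
print(is_valid_ipv6("8055:F2F2::FFF1::DD04"))
```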
Sub-netting
CIDR is actually based on sub-netting and the two are similar in many ways. Sub-netting divides a LAN into two or more smaller networks. This helps reduce network traffic and can also hide the complexity of the overall network. Recall that the IP address (using IPv4) is made up of the netID and hostID. Suppose a university network has eight departments and has a netID of 192.200.20 (11000000.11001000.00010100). All of the devices on the university network will be associated with this netID and can have hostID values from 00000001 to 11111110 (hostIDs containing all 0s or all 1s are forbidden). The university network will look something like this:
[Diagram: eight departments (Humanities, Admin and finance, Maths, Science, Arts, Engineering, Business and Computing) all connect through a single gateway to the internet.]
▲ Figure 2.23 An example of a university network
So, for example, the devices in the Admin and finance department might have hostIDs of 1, 8, 240, 35, 67, 88, 134, and so on, with similar spreads for the other seven departments. It would be beneficial to organise the netIDs and hostIDs so that the network was a lot less complex in nature. With sub-netting, the hostID is split as follows: 000 00000, where the first 3 bits are netID expansion and the last 5 bits are the hostIDs.
Thus, we have eight sub-nets with the same range of hostIDs.

| Department | netID | hostID range |
|---|---|---|
| Admin and finance | 192.200.20.0 | 00001 to 11110 |
| Humanities | 192.200.20.1 | 00001 to 11110 |
| Maths | 192.200.20.2 | 00001 to 11110 |
| Science | 192.200.20.3 | 00001 to 11110 |
| Arts | 192.200.20.4 | 00001 to 11110 |
| Engineering | 192.200.20.5 | 00001 to 11110 |
| Computing | 192.200.20.6 | 00001 to 11110 |
| Business | 192.200.20.7 | 00001 to 11110 |

▲ Table 2.9
[Diagram: each department sub-net, labelled with its netID from Table 2.9, connects through a router to the internet.]
▲ Figure 2.24 An example of a university network with netIDs
The devices in the Admin and finance department will have IP addresses
192.200.20.000 00001 to 192.200.20.000 11110
The Humanities department will have IP addresses
192.200.20.001 00001 to 192.200.20.001 11110
And so on for the other departments.
To obtain the netID from the IP address we can apply the AND mask (recall that 1 AND 1 = 1, 0 AND 0 = 0 and 1 AND 0 = 0). Thus, if a device has an IP address of
11000000.11001000.00010100.011 00011
we can apply the AND mask
11111111.11111111.11111111.111 00000
which results in the netID value
11000000.11001000.00010100.011 00000 (sub-net 3, shown as 192.200.20.3 in Table 2.9)
This is the Science department. Consequently, the whole network is more efficient (for the reasons stated above) and less complex. Compare this to CIDR 192.200.20.0/27, which extends the size of the netID to 27 bits and has a hostID of only 5 bits, but would not reduce the complexity of the network.
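The AND mask calculation can be checked with a few lines of Python. The helper names `to_int` and `to_dotted` are illustrative; the device address 192.200.20.99 is simply the binary address used above (011 00011 = 99) written in denary, and 255.255.255.224 is the mask (111 00000 = 224).

```python
def to_int(dotted):
    """Convert a dotted IPv4 address to a single 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    return value


def to_dotted(value):
    """Convert a 32-bit integer back to dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


address = to_int("192.200.20.99")   # last group is 011 00011
mask = to_int("255.255.255.224")    # last group is 111 00000

network = address & mask                  # keep the netID bits only
subnet_number = (network & 0xFF) >> 5     # the 3 bits of netID expansion
host_id = address & ~mask & 0xFFFFFFFF    # the last 5 bits

print(to_dotted(network), subnet_number, host_id)
```

The last group of the result is 011 00000, which is 96 in denary; its first three bits give sub-net 3.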
Private IP addresses and public IP addresses
Private IP addresses are reserved for internal use behind a router or other NAT device. The following blocks are reserved for private IP addresses.

| Class | Range | Number of addresses |
|---|---|---|
| Class A | 10.0.0.0 to 10.255.255.255 | about 16 million possible addresses |
| Class B | 172.16.0.0 to 172.31.255.255 | about 1 million possible addresses |
| Class C | 192.168.0.0 to 192.168.255.255 | 65 536 possible addresses |

▲ Table 2.10

Private IP addresses (which are for internal use only) allow for an entirely separate set of addresses within a network. They allow access to the network without taking up a public IP address space. However, devices using these private IP addresses cannot be reached by internet users.
Public IP addresses are the ones allocated by a user’s ISP to identify the location of their device. Devices using these IP addresses are accessible from anybody using the internet. Public IP addresses are used by » DNS servers » network routers » directly-controlled computers.
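Whether an address falls inside one of the three reserved private blocks can be checked with Python's standard `ipaddress` module. In this sketch the blocks are written in CIDR form (/8, /12 and /16), and the function name `is_private` and the sample addresses are chosen only for illustration.

```python
import ipaddress

# The three reserved private blocks, written with CIDR suffixes.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 to 192.168.255.255
]


def is_private(text):
    """Return True if the address lies in one of the reserved private blocks."""
    address = ipaddress.ip_address(text)
    return any(address in block for block in PRIVATE_BLOCKS)


for sample in ["10.1.2.3", "172.31.255.255", "172.32.0.1", "192.168.1.20"]:
    print(sample, "private" if is_private(sample) else "not in a private block")

for block in PRIVATE_BLOCKS:
    print(block, block.num_addresses)
```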
2.2.4 Uniform resource locators (URLs)
Web browsers are software that allow users to access and display web pages on their screens. They interpret HTML sent from websites and display the results. Web browsers use uniform resource locators (URLs) to access websites. A website is ultimately located by its IP address, a set of four numbers such as 109.108.158.1, but it is much easier to type the address into a browser using the following format:
protocol://website address/path/filename
Protocol is usually http or https
Website address is
» domain host (www)
» domain name (name of website)
» domain type (.com, .org, .net, .gov, and so on)
» (sometimes) a country code (.uk, .de, .cy, .br, and so on).
Path is the web page (if this is omitted then it is the root directory of the website)
Filename is the item from the web page
For example: http://www.hoddereducation.co.uk/computerscience
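The parts of a URL can be pulled apart with Python's standard `urllib.parse` module. This is a small sketch for illustration only; the function name `url_parts` and the dictionary keys are assumptions chosen to mirror the terms used above.

```python
from urllib.parse import urlsplit

def url_parts(url: str) -> dict:
    """Split a URL into the parts named in the text."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,           # http or https
        "website address": parts.hostname,  # domain host + name + type (+ country code)
        "domain host": labels[0],           # usually www
        "path": parts.path,                 # empty if the URL points at the root
    }

print(url_parts("http://www.hoddereducation.co.uk/computerscience"))
```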
2.2.5 Domain name service (DNS)
The domain name service (DNS) (also known as domain name system) gives domain names for internet hosts and is a system for finding IP addresses of a domain name. Domain names eliminate the need for a user to memorise IP addresses. The DNS process involves converting a host name (such as www.hoddereducation.co.uk) into an IP address the computer can understand (such as 107.162.140.19).
Often, DNS servers contain a database of URLs with the matching IP addresses.
[Figure: the user’s computer, DNS server (1), DNS server (2) and the website server, linked by the numbered steps 1 to 5]
▲ Figure 2.25 An example of the DNS process
① The user opens their web browser and types in the URL (www.hoddereducation.co.uk) and the web browser asks the DNS server (1) for the IP address of the website.
② The DNS server can’t find www.hoddereducation.co.uk in its database or its cache and sends out a request to DNS server (2).
③ DNS server (2) finds the URL and can map it to 107.162.140.19; the IP address is sent back to DNS server (1) which now puts the IP address and associated URL into its cache/database.
④ This IP address is then sent back to the user’s computer.
⑤ The computer now sets up a communication with the website server and the required pages are downloaded. The web browser interprets the HTML and displays the information on the user’s screen.
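Steps ① to ④ can be sketched as a lookup with a cache. The code below is a simplified simulation, not a real DNS client: the two dictionaries are hypothetical stand-ins for DNS server (2)'s database and DNS server (1)'s cache, and the function name `resolve` is an assumption.

```python
# Hypothetical in-memory stand-ins for the two DNS servers in Figure 2.25.
DNS_SERVER_2 = {"www.hoddereducation.co.uk": "107.162.140.19"}
dns_server_1_cache = {}

def resolve(host_name: str) -> str:
    """Return the IP address for host_name, following steps 1 to 4."""
    # Step 1: the browser asks DNS server (1), which checks its cache first.
    if host_name in dns_server_1_cache:
        return dns_server_1_cache[host_name]
    # Step 2: not found, so DNS server (1) sends a request to DNS server (2).
    ip_address = DNS_SERVER_2[host_name]  # raises KeyError for an unknown host
    # Step 3: DNS server (1) stores the answer in its cache.
    dns_server_1_cache[host_name] = ip_address
    # Step 4: the IP address is sent back to the user's computer.
    return ip_address

print(resolve("www.hoddereducation.co.uk"))  # first call goes to DNS server (2)
print(resolve("www.hoddereducation.co.uk"))  # second call is answered from the cache
```

The second call never reaches DNS server (2), which is the point of caching the result in step ③.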
2.2.6 Scripting in HTML
This section considers HTML scripting using JavaScript and PHP. While this extends beyond the syllabus, it is included here to help you understand how HTML is used to create websites and how web browsers communicate with servers.
A user may wish to develop a web application, which is client-server based, on their own computer. To do this they would need to:
» download the necessary server software
» install the application on the chosen/allocated server
» use the web browser on their computer to access and interpret the application web pages.
Each web page would need to be created using HTML. A domain name would have to be purchased from a web-hosting company. The HTML files would need to be uploaded to the server which was allocated to the user by the web-hosting company.
HTML would be used to create a file using tags. For example:
Example
[program code]
JavaScript or PHP can be included between the HTML tags.

JavaScript
JavaScript (unlike HTML) is a programming language which will run on the client-side. What is the difference between running on the client-side and running on the server-side?
» Client-side – the script runs on the computer which is making the request, processing the web page data that is being sent to the computer from the server.
» Server-side – the script is run on the web server and the results of processing are then sent to the computer that made the request.
The following short program inputs a temperature and outputs ‘HIGH’ if it is 200 °C or over, ‘OK’ if it is 100 °C or over and ‘LOW’ if it is below 100 °C.
[program code]
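The branching described above can be sketched as follows. This is a Python illustration of the same logic, not the book's JavaScript listing (so its line numbers do not match those referred to in the activity below); the function name `classify` is an assumption made for the example.

```python
def classify(temperature: float) -> str:
    """Return 'HIGH', 'OK' or 'LOW' for a temperature in degrees Celsius."""
    if temperature >= 200:
        return "HIGH"   # 200 °C or over
    if temperature >= 100:
        return "OK"     # 100 °C or over, but below 200 °C
    return "LOW"        # below 100 °C

for reading in (250, 150, 20):
    print(reading, classify(reading))
```

Note that the order of the tests matters: checking for 100 °C first would label every high reading as ‘OK’.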
EXTENSION ACTIVITY 2F
Look at the two pieces of code in the previous JavaScript and PHP sections, then answer these questions.
a) Write down the names of two variables which are used in each piece of code.
b) In each case, identify which statement(s) correspond(s) to an output.
c) What is the purpose of the statement shown in line:
i) 09 of the JavaScript code
ii) 03 of the PHP code?
d) What is the purpose of line 05 in the JavaScript?
ACTIVITY 2C
1 a) Describe what happens when a telephone call is made using PSTN.
b) Describe what happens when a computer, equipped with microphone and speakers, is used to make a ‘telephone’ call over the internet.
c) Communication links between continents frequently involve the use of satellite technology. Explain the differences between GEO, MEO and LEO satellites.
2 a) Class A computer networks are identified by IP addresses starting with 0.0.0.0, class B computer networks are identified by IP addresses starting with 128.0.0.0 and class C computer networks are identified by IP addresses starting with 192.0.0.0. (Class D networks begin with 224.0.0.0.) Write these starting IP addresses in binary format.
b) Using the data above, write down the upper IP addresses of the three network classes A, B and C.
c) A device on a network has the IP address: 10111110 00001111 00011001 11110000
i) Which class of network is the device part of?
ii) Which bits are used for the net ID and which bits are used for the host ID?
iii) A network uses IP addresses of the form 200.35.254.25/18. Explain the significance of the appended value 18.
d) Give two differences between IPv4 and IPv6.
3 a) Describe the differences between private IP addresses and public IP addresses.
b) Identify the protocol, domain name and file name used in the following URL: https://www.exampleofaurl.co.de/computer_logic.html
c) Describe how DNS is used to retrieve a web page from the website used in part b).
4 a) Explain the differences between the internet and the world wide web (www).
b) Hasina wrote, ‘The internet is not necessarily a type of WAN.’ Is Hasina’s statement correct? Give reasons for your answer.
c) Explain these two terms.
i) Web browser
ii) Internet service provider (ISP)

End of chapter questions
1 Star and mesh are two types of network topology that can be used to make a LAN.
[Figure: a star network and a mesh network]
a) i) State one benefit and one drawback of the star network topology. [2]
ii) State one benefit and one drawback of the mesh network topology. [2]
b) Copy the diagram below and connect each description to either a client-server or peer-to-peer network. [4]
Type of network
» Client-server
» Peer-to-peer
Description
» Connectivity is the most important aspect of this type of network
» Uses separate dedicated servers and specific workstations
» Has no central storage and doesn’t require authentication of users
» Sharing of data is the most important aspect of this type of network
» Has no central server; each workstation shares its files/data with the others
» Performance and management issues can occur if the number of workstations exceeds ten
» Once logged in, a user can only access resources that the network manager allows them to use
» More stable system since there is centralised backing up of files
2 a) Conventional telephone calls are made using the public switched telephone network (PSTN). The national network uses both copper cables and fibre optic cables.
i) Explain the difference between copper cabling and fibre optic cabling. [2]
ii) Describe two benefits and two drawbacks of both types of cabling. [4]
b) Satellite technology is often used in long distance communications. Compare the differences between GEO, MEO and LEO satellites. [3]
c) Some telephones use Bluetooth to connect to the telephone network. Explain what is meant by:
i) the attenuation of a signal [2]
ii) spread spectrum frequency hopping. [2]
3 a) Explain the term bit streaming. [2]
b) A person watches a film streamed from a website on a tablet computer.
i) Give two benefits of using bit streaming for this purpose. [2]
ii) State two potential problems of using bit streaming for this purpose. [2]
c) Explain the terms on-demand bit streaming and real-time bit streaming. [4]
Cambridge International AS & A Level Computer Science 9608 Paper 11 Q1 November 2015
4 A buffer is 2 MiB in size. The lower limit of the buffer is set at 200 KiB and the higher limit is set at 1.8 MiB. Data is being streamed at 1.5 Mbps and the media player is taking data at the rate 600 kbps. You may assume a megabit is 1 048 576 bits and a kilobit is 1024 bits.
a) Explain why the buffer is needed. [2]
b) i) Calculate the amount of data stored in the buffer after 2 seconds of streaming and playback. You may assume that the buffer already contains 200 KiB of data. [4]
ii) By using different time values (such as 4 secs, 6 secs, 8 secs, and so on) determine how long it will take before the buffer reaches its higher limit (1.8 MiB). [5]
c) Describe how the problem calculated in part b) ii) can be overcome so that a 30-minute video can be watched without frequent pausing of playback. [2]
5 a) When data is transmitted over a LAN there is the possible risk of data collision.
i) Explain the term data collision. [2]
ii) Describe how CSMA/CD is able to detect collisions. [1]
iii) Explain how CSMA/CD can be used to resolve the problem of data collision. [2]
b) Copy the diagram below and connect each network device to its description. [5]
Network device
» gateway
» switch
» hub
» router
» bridge
Description
» device that analyses packets of data transmitted from one network to another or analyses data within a single network
» network point (node) that connects two networks that use different protocols
» device that connects LANs that use the same protocol to allow them to work as a single network
» device on a network that redirects data received to only those destinations on the LAN that match the address in the data packet
» device that sends all the received data packets to every device in the network irrespective of any data packet addresses
3 Hardware
In this chapter, you will learn about
★ primary storage/memory devices
★ secondary storage (including removable devices)
★ the benefits and drawbacks of embedded systems
★ hardware devices used as input, output and storage
★ the differences between RAM, ROM, SRAM, DRAM, PROM and EPROM
★ the use of RAM, ROM, SRAM and DRAM in a range of devices
★ monitoring and control systems
★ the use of logic gates: NOT, AND, OR, NAND, NOR and XOR
★ the construction and use of truth tables
★ the construction of logic circuits, truth tables and logic expressions from a variety of logic information.
WHAT YOU SHOULD ALREADY KNOW
Try these five questions before you read this chapter.
1 What is the difference between memory and storage?
2 Why is it necessary to have both internal and external memory/storage devices?
3 Can you recognise the memory/storage devices on the right?
4 What is the difference between online and offline storage?
5 What is the difference between data access time and data transfer rate when using memory and storage devices?
▲ Figure 3.1 Memory/storage devices
3.1 Computers and their components
Key terms
Memory cache – high speed memory external to processor which stores data which the processor will need again.
Random access memory (RAM) – primary memory unit that can be written to and read from.
Read-only memory (ROM) – primary memory unit that can only be read from.
Dynamic RAM (DRAM) – type of RAM chip that needs to be constantly refreshed.
Static RAM (SRAM) – type of RAM chip that uses flip-flops and does not need refreshing.
Refreshed – requirement to charge a component to retain its electronic state.
Programmable ROM (PROM) – type of ROM chip that can be programmed once.
Erasable PROM (EPROM) – type of ROM that can be programmed more than once; ultraviolet (UV) light is used to erase it.
Hard disk drive (HDD) – type of magnetic storage device that uses spinning disks.
Latency – the lag in a system; for example, the time to find a track on a hard disk, which depends on the time taken for the disk to rotate around to its read-write head.
Fragmented – storage of data in non-consecutive sectors; for example, due to editing and deletion of old data.
Removable hard disk drive – portable hard disk drive that is external to the computer; it can be connected via a USB port when required; often used as a device to back up files and data.
Solid state drive (SSD) – storage media with no moving parts that relies on movement of electrons.
Electronically erasable programmable read-only memory (EEPROM) – read-only (ROM) chip that can be modified by the user, which can then be erased and written to repeatedly using pulsed voltages.
Flash memory – a type of EEPROM, particularly suited to use in drives such as SSDs, memory cards and memory sticks.
Optical storage – CDs, DVDs and Blu-ray™ discs that use laser light to read and write data.
Dual layering – used in DVDs; uses two recording layers.
Birefringence – a reading problem with DVDs caused by refraction of laser light into two beams.
Binder 3D printing – 3D printing method that uses a two-stage pass; the first stage uses dry powder and the second stage uses a binding agent.
Direct 3D printing – 3D printing technique where print head moves in the x, y and z directions. Layers of melted material are built up using nozzles like an inkjet printer.
Digital to analogue converter (DAC) – needed to convert digital data into electric currents that can drive motors, actuators and relays, for example.
Organic LED (OLED) – uses movement of electrons between cathode and anode to produce an on-screen image. It generates its own light so no back lighting required.
Screen resolution – number of pixels in the horizontal and vertical directions on a television/computer screen.
Touch screen – screen on which the touch of a finger or stylus allows selection or manipulation of a screen image; they usually use capacitive or resistive technology.
Capacitive – type of touch screen technology based on glass layers forming a capacitor, where fingers touching the screen cause a change in the electric field.
Resistive – type of touch screen technology. When a finger touches the screen, the glass layer touches the plastic layer, completing the circuit and causing a current to flow at that point.
Virtual reality headset – apparatus worn on the head that covers the eyes like a pair of goggles. It gives the user the ‘feeling of being there’ by immersing them totally in the virtual reality experience.
Sensor – input device that reads physical data from its surroundings.
Analogue to digital converter (ADC) – needed to convert analogue data (read from sensors, for example) into a form understood by a computer.
3.1.1 Types of memory and storage
Computers require some form of memory and storage. Memory usually refers to the internal devices which the computer can access directly. This memory can hold the user’s workspace, temporary data or data that is key to running the computer. Storage devices allow users to store applications, data and files. The user’s data is stored permanently and they can change it or read it as they wish. Storage needs to be larger than internal memory since the user may wish to store large files (such as music files or photographic images). Storage devices can also be removable to allow data, for example, to be transferred between computers. Removable devices allow a user to store important data in a different building in case of data loss. However, all of this has become a lot less important with the advent of technology such as ‘data drop’ (which uses Bluetooth) and cloud storage.
Internal memory includes components such as registers (which are part of the processor). There is also memory cache (which is external to the processor); this is used to store data which the processor will probably need to use again.
Figure 3.2 summarises the types of memory and storage devices covered in this chapter.
[Figure: primary memory (RAM, ROM); secondary storage (hard disk drive (HDD), solid state drive (SSD)); removable devices (DVD/CD/Blu-ray, flash memory stick, hard disk drive)]
▲ Figure 3.2 Memory and storage devices
Primary memory
Primary memory is the part of computer memory which can be accessed directly from the CPU and, as Figure 3.2 shows, contains the random access memory (RAM) and read-only memory (ROM) chips. Primary memory allows the processor to access applications and services temporarily stored in memory locations. The structure of primary memory is shown in Figure 3.3.
[Figure: primary memory divides into RAM (SRAM and DRAM) and ROM (PROM, EPROM and EEPROM)]
▲ Figure 3.3 Structure of primary memory
All computer systems come with some form of RAM. These memory devices are not really random; the name refers to the fact that any memory location can be accessed independently of which memory location was last used. Access time to locate data is much faster in RAM than in secondary devices. RAM can also be
» written to or read from, and the data stored can be changed by the user or by the computer
» used to store data, files, part of an application or part of the operating system currently in use
» volatile (memory contents are lost on powering off the computer).
In general, the larger the RAM, the faster the computer will operate. In reality, RAM never runs out of memory; it continues to operate but just becomes slower and slower as more data is stored. As RAM becomes ‘full’, the processor has to continually access the secondary data storage devices to overwrite old data on RAM with new data. By increasing the RAM size, the number of times this has to be done is considerably reduced, thus making the computer operate more quickly.
There are currently two types of RAM technology, dynamic RAM (DRAM) and static RAM (SRAM).
Dynamic RAM (DRAM)
▲ Figure 3.4 Two pieces of dynamic random access memory (DRAM)
Each DRAM chip consists of a number of transistors and capacitors. Each of these parts is tiny since a single RAM chip will contain millions of capacitors and transistors.
» Capacitors hold the bits of information (0 or 1).
» Transistors act like switches; they allow the chip control circuitry to read the capacitor or change the capacitor’s value.
This type of RAM needs to be constantly refreshed (that is, the capacitor needs to be re-charged every 15 microseconds otherwise it would lose its value). If it is not refreshed, the capacitor’s charge will leak away very quickly, leaving every capacitor with the value 0.
DRAMs have a number of advantages over SRAMs. They:
» are much less expensive to manufacture than SRAMs
» consume less power than SRAMs
» have a higher memory capacity than SRAMs.

Static RAM (SRAM)
▲ Figure 3.5 Static RAM
A major difference between SRAM and DRAM is that SRAM does not need to be constantly refreshed. It makes use of flip-flops (see Chapter 15) which hold each bit of memory. SRAM is much faster than DRAM when it comes to data access (typically, access time for SRAM is 25 nanoseconds and for DRAM is 60 nanoseconds).
DRAM is the most common type of RAM used in computers, but where absolute speed is essential, for example in the processor’s memory cache, SRAM is the preferred technology. Memory cache is a high speed portion of the memory. It is effective because most programs access the same data or instructions many times. By keeping as much of this information as possible in SRAM, the computer avoids having to access the slower DRAM. Table 3.1 summarises the differences between DRAM and SRAM.

| DRAM | SRAM |
|---|---|
| consists of a number of transistors and capacitors | uses flip-flops to hold each bit of memory |
| needs to be constantly refreshed | does not need to be constantly refreshed |
| less expensive to manufacture than SRAM | has a faster data access time than DRAM |
| has a higher memory capacity than SRAM | processor memory cache makes use of SRAM |
| main memory is constructed from DRAM | if accessed at a high frequency, power usage can exceed that of DRAM |
| consumes more power than SRAM under reasonable levels of access, as it needs to be constantly refreshed | |

▲ Table 3.1 Differences between DRAM and SRAM
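To see why keeping frequently used data in SRAM pays off, the typical access times quoted earlier (25 ns for SRAM, 60 ns for DRAM) can be combined into an average. The sketch below is a deliberately simplified model and not from the original text: it assumes every access is served either by the SRAM cache or by DRAM, and the `hit_rate` values are made-up illustrations.

```python
SRAM_ACCESS_NS = 25  # typical SRAM access time quoted in the text
DRAM_ACCESS_NS = 60  # typical DRAM access time quoted in the text

def average_access_ns(hit_rate: float) -> float:
    """Average access time when a fraction hit_rate of accesses is served from SRAM.

    Simplified model (an assumption): a cache miss costs one DRAM access only.
    """
    return hit_rate * SRAM_ACCESS_NS + (1 - hit_rate) * DRAM_ACCESS_NS

for rate in (0.0, 0.5, 0.9, 1.0):
    print(f"hit rate {rate:.0%}: about {average_access_ns(rate):.1f} ns")
```

With a hit rate of 90 per cent the average is about 28.5 ns, close to the SRAM figure, which is why even a small cache helps when programs reuse the same data.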
Another form of primary memory is the read-only memory (ROM). This is similar to RAM in that it shares the same random access properties, but it cannot be written to or changed. As the name suggests, ROM is a read-only memory device. ROMs are
» non-volatile (the contents are not lost after powering off the computer)
» permanent memory devices (the contents cannot be changed)
» often used to store data which the computer needs to access when powering up for the first time; for example, the basic input/output system (BIOS).
Table 3.2 summarises the main differences between RAM and ROM.

| RAM | ROM |
|---|---|
| temporary memory device | permanent memory device |
| volatile memory | non-volatile memory device |
| can be written to and read from | data stored cannot be altered |
| used to store data, files, programs, part of OS currently in use | sometimes used to store BIOS and other data needed at start up |
| can be increased in size to improve operational speed of a computer | |

▲ Table 3.2 Differences between RAM and ROM
PROM and EPROM
A programmable read-only memory (PROM) is a type of ROM chip that can be altered once. A PROM is made up of a matrix of fuses. Programming a PROM requires the use of a PROM writer which uses an electric current to alter specific cells by ‘burning’ fuses in the matrix. Due to the method of programming (writing), a PROM can only be written to once. They are often used in mobile phones and in RFID tags.
An erasable programmable read-only memory (EPROM) is different to a PROM because it uses floating gate transistors and capacitors rather than fuses. Ultraviolet (UV) light, shone through a quartz window, is used to erase an EPROM so that it can be programmed again. EPROMs are used in applications which are under development, such as the programming of new games consoles.

Embedded systems
Embedded systems involve installing microprocessors into devices to enable operations to be controlled in a more efficient way. Devices such as cookers, refrigerators and central heating systems can now all be activated by a web-enabled device (such as a mobile phone or tablet). The time a central heating system switches on or off and the temperature can all be set from an app on a mobile phone from anywhere in the world. There are pros and cons of devices being controlled in this manner, as shown in Table 3.3.

| Pros of embedded systems | Cons of embedded systems |
|---|---|
| small in size and therefore easy to fit into devices | difficult to upgrade devices to take advantage of new technology |
| relatively low cost to make | troubleshooting faults in the device becomes a specialist task |
| usually dedicated to one task, making for simple interfaces and often no requirement of an operating system | although the interface can appear to be simple, in reality it can be more confusing (changing the time on a cooker clock can require several steps, for example) |
| consume very little power | any device that can be accessed over the internet is also open to hackers, viruses, and so on |
| very fast reaction to changing input (operate in real time) | due to the difficulty in upgrading and fault finding, devices are often just thrown away rather than being repaired (wasteful) |
| with mass production comes reliability | |

▲ Table 3.3 Pros and cons of controlling devices with embedded systems
EXTENSION ACTIVITY 3A
Describe how ROM and RAM chips could be used in:
a) a microwave oven
b) a refrigerator
c) a remote-controlled model aeroplane (the movement of the aeroplane is controlled by a hand-held device).
Secondary storage devices
Secondary storage includes storage devices that are not directly accessible by the CPU. They are non-volatile devices which allow data to be stored as long as required by the user. This type of storage is much larger than primary memory, but data access is considerably slower than with RAM and ROM. All applications, the operating system, device drivers and general files (for example, documents, photos and music) are stored on secondary storage. The following section discusses the various types of secondary storage that can be found on the majority of computers. Secondary storage devices fall into three categories: magnetic, solid state and optical.

Hard disk drives (HDD)
Hard disk drives (HDD) are still one of the most common methods used to store data on a computer.
Data is stored in a digital format on the magnetic surfaces of the disks (or platters, as they are frequently called). The hard disk drive will have a number of platters which can spin at about 7000 times a minute. A number of read-write heads can access all of the surfaces in the disk drive. Normally each platter will have two surfaces which can be used to store the data. These read-write heads can move very quickly – typically they can move from the centre of the disk to the edge of the disk (and back again) 50 times a second. Data is stored on the surface in sectors and tracks. A sector on a given track will contain a fixed number of bytes.
[Figure: a disk surface showing a track and a sector]
▲ Figure 3.6 Tracks and sectors on a hard disk drive
Unfortunately, hard disk drives have very slow data access when compared to, for example, RAM. Many applications require the read-write heads to constantly seek the correct blocks of data; this means a large number of head movements. The effects of latency then become very significant. Latency is defined as the time it takes for a specific block of data on a data track to rotate around to the read-write head. Users will sometimes notice the effect of latency when they see messages such as, ‘Please wait’ or, at its worst, ‘not responding’.
When a file or data is stored on an HDD, the required number of sectors needed to store the data will be allocated. However, the sectors allocated may not be adjacent to each other. Through time, the HDD will undergo numerous deletions and editing, which leads to sectors becoming increasingly fragmented, resulting in a gradual deterioration of the HDD performance (in other words, it takes longer and longer to access data). Defragmentation software can improve on this situation by ‘tidying up’ the disk sectors. An HDD is a direct access device; however, data in a given sector will be read sequentially.
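The size of the latency effect can be estimated from the rotation speed. The sketch below rests on a stated assumption that is not in the original text: on average the required block is half a rotation away from the read-write head. The function name and the example speeds are illustrative only.

```python
def average_rotational_latency_ms(rotations_per_minute: float) -> float:
    """Average time, in milliseconds, for a block to rotate round to the head.

    Assumption: on average the block is half a rotation away.
    """
    seconds_per_rotation = 60 / rotations_per_minute
    return (seconds_per_rotation / 2) * 1000

for rpm in (6000, 7000):
    print(f"{rpm} rotations per minute: about {average_rotational_latency_ms(rpm):.1f} ms")
```

A few milliseconds per access is far slower than the nanosecond access times quoted for RAM, which is why repeated seeks become noticeable to the user.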
Removable hard disk drives are essentially HDDs that are external to the computer and can be connected to the computer using one of the USB ports. In this way, they can be used as back-up devices or as another way of transferring files between computers.
EXTENSION ACTIVITY 3B The length of a track on each disk in an HDD disk pack becomes much shorter towards the centre of the disk. Find out how manufacturers have overcome this issue with regards to disk data capacity and data access time.
Solid state drives (SSD)
Latency is an issue in HDDs, as discussed earlier. Solid state drives (SSD) reduce this issue considerably. They have no moving parts and all data is retrieved at the same rate. They do not rely on magnetic properties. The most common type of solid state storage devices store data by controlling the movement of electrons within NAND chips. The data is stored as 0s and 1s in millions of tiny transistors (at each junction one transistor is called a floating gate and the other is called a control gate) within the chip. This effectively produces a non-volatile rewritable memory.
However, some solid state storage devices use electronically erasable PROM (EEPROM) technology. The main difference is the use of NOR chips rather than NAND. This makes them faster in operation, but devices using EEPROM are considerably more expensive than those that use NAND technology. EEPROM also allows data to be read or erased in single bytes at a time. Use of NAND only allows blocks of data to be read or erased. This makes EEPROM technology more useful in certain applications where data needs to be accessed or erased in byte-size chunks. Because of the cost implications, the majority of solid state storage devices use NAND technology. The two are usually distinguished by the terms flash memory (uses NAND) and EEPROM (uses NOR).
So, what are the main benefits of using an SSD rather than an HDD? Solid state drives
» are more reliable (no moving parts to go wrong)
» are considerably lighter (which makes them suitable for laptops)
» do not have to ‘get up to speed’ before they work properly
» have a lower power consumption
» run much cooler than HDDs (both these points again make them very suitable for laptop computers)
» are very thin (because they have no moving parts)
» access data considerably faster.
The main drawback of SSD is the still unknown longevity of the technology. Most solid state storage devices are conservatively rated at only 20 GB of write operations per day over a three year period – this is known as SSD endurance. For this reason, SSD technology is not commonly used in servers, for example, where a huge number of write operations take place every day. However, this issue is being addressed by a number of manufacturers to improve the durability of these solid state systems and they are rapidly becoming more common in applications such as servers and cloud storage devices.
Note that it is also not possible to over-write existing data on a flash memory device; it is necessary to first erase the old data and then write the new data at the same location.
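The endurance rating quoted above can be turned into a total amount of data written. This is a small worked calculation, assuming 365 days per year and 1 TB = 1000 GB; the function name is made up for the example.

```python
def total_rated_writes_tb(gb_per_day: float, years: float) -> float:
    """Total data written over the rated period, in terabytes (1 TB = 1000 GB assumed)."""
    days = 365 * years  # leap days ignored
    return gb_per_day * days / 1000

# 20 GB per day for three years is 21 900 GB, or about 21.9 TB.
print(total_rated_writes_tb(20, 3))
```

A busy server can write this much data in a small fraction of three years, which illustrates why endurance mattered for that kind of use.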
Memory sticks/flash memories (also known as pen drives) use solid state technology. They usually connect to the computer through the USB port. Their main advantage is that they are very small, lightweight devices which make them suitable for transferring files between computers. They can also be used as small back-up devices for music or photo files, for example.
Complex or expensive software, such as an expert system, will often use a memory stick as a dongle. The dongle contains additional files which are needed to run the software. Without this dongle, the software will not work properly. It therefore prevents illegal or unauthorised use of the software, and also prevents copying of the software since, without the dongle, it is useless.

Optical media: CDs, DVDs and Blu-ray discs
CDs and DVDs are described as optical storage devices. Laser light is used to read data from, and write data onto, the surface of a disk.
[Figure: a disk with a single spiral track running from the centre to the outer part of the disk, with pits or bumps along the track]
▲ Figure 3.7 CDs and DVDs use a single, spiral track
Both CDs and DVDs use a thin layer of metal alloy or light-sensitive organic dye to store the data. As shown in Figure 3.7, both systems use a single, spiral track which runs from the centre of the disk to the edge. When a disk spins, the optical head moves to the point where the laser beam ‘contacts’ the disk surface and follows the spiral track from the centre outwards. As with an HDD, a CD/DVD is divided into sectors allowing direct access of data. Also, as in the case of an HDD, the outer part of the disk runs faster than the inner part of the disk.
EXTENSION ACTIVITY 3C The outer part of an optical disk runs faster than the inner part of the disk. Find out how manufacturers have overcome this issue with regards to disk data capacity and data access time.
The data is stored in ‘pits’ and ‘bumps’ on the spiral track. A red laser is used to read and write the data. CDs and DVDs can be designated R (write once only) or RW (can be written to or read from many times).
DVD technology is slightly different to that used in CDs. One of the main differences is the use of dual layering, which considerably increases the storage capacity. This means that there are two individual recording layers. Two layers of a standard DVD are joined together with a transparent (polycarbonate) spacer, and a very thin reflector is sandwiched between the two layers. Reading and writing of the second layer is done by a red laser focusing at a fraction of a millimetre difference compared to the first layer.
▲ Figure 3.8 Dual layering in a DVD (labels: polycarbonate layer; first layer; polycarbonate layer; second layer; laser reads layer 1; laser reads layer 2)
Standard, single layer DVDs still have a larger storage capacity than CDs because the ‘pit’ size and track width are both smaller. This means that more data can be stored on the DVD surface. DVDs use lasers with a wavelength of 650 nanometres; CDs use lasers with a wavelength of 780 nanometres. The shorter the wavelength of the laser light, the greater the storage capacity of the medium.
» Blu-ray discs are another example of optical storage media. However, they are fundamentally different to DVDs in their construction and in the way they carry out read-write operations.
» Blu-ray uses a blue laser, rather than a red laser, to carry out read and write operations; the wavelength of blue light is only 405 nanometres (compared to 650 nm for red light).
» Using blue laser light means that the ‘pits’ and ‘bumps’ can be much smaller; consequently, a Blu-ray can store up to five times more data than a DVD.
» Blu-ray uses a single 1.1 mm thick polycarbonate disk; DVDs use a sandwich of two 0.6 mm thick disks.
» Using two sandwiched layers can cause birefringence (light is refracted into two separate beams causing reading errors); because Blu-ray uses only one layer, the discs do not suffer from birefringence.
» Blu-ray discs automatically come with a secure encryption system which helps to prevent piracy and copyright infringement.
Table 3.4 summarises the main differences between CDs, DVDs and Blu-ray.
disk type | laser colour | wavelength of laser light | disk construction | track pitch (distance between tracks)
CD | red | 780 nm | single 1.2 mm polycarbonate layer | 1.60 µm
DVD | red | 650 nm | two 0.6 mm polycarbonate layers | 0.74 µm
Blu-ray | blue | 405 nm | single 1.1 mm polycarbonate layer | 0.30 µm
nm = $10^{-9}$ metres; µm = $10^{-6}$ metres
▲ Table 3.4 Main differences between CDs, DVDs and Blu-ray
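The effect of track pitch on capacity can be illustrated with a little arithmetic. The sketch below uses only the track pitch figures from Table 3.4; the function name `tracks_per_mm` is an illustrative choice, and the calculation ignores pit length, so it shows only one of the factors that affect capacity.

```python
# Track pitch values taken from Table 3.4 (in micrometres).
TRACK_PITCH_UM = {"CD": 1.60, "DVD": 0.74, "Blu-ray": 0.30}


def tracks_per_mm(pitch_um: float) -> float:
    """Number of turns of the spiral track crossed per millimetre of disk radius."""
    return 1000.0 / pitch_um  # 1 mm = 1000 µm


for disk, pitch in TRACK_PITCH_UM.items():
    print(f"{disk}: about {tracks_per_mm(pitch):.0f} track turns per mm of radius")
```

The smaller the track pitch, the more turns of the spiral fit into the same radius, which is one reason why a Blu-ray disc holds more data than a DVD, and a DVD more than a CD.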
All these optical storage media are used as back-up systems (for photos, music and multimedia files). This also means that CDs and DVDs can be used to transfer files between computers. Manufacturers sometimes supply their software (such as printer drivers) on CDs and DVDs. When the software is supplied in this way, the disk is usually in a read-only format. The most common use of DVD and Blu-ray is the supply of movies or games. The storage capacity of CDs is not big enough to store most movies.
A recent development is PRAM or PCRAM (phase-change RAM), which utilises chalcogenide glass. This is glass containing elements such as sulphur, antimony, selenium, germanium or tellurium. Chalcogenide compounds used in PRAMs/PCRAMs can be changed between the amorphous (glass-like) state and crystalline state, which changes the optical and electrical properties allowing the storage of data when used as a film on the surface of optical media.
EXTENSION ACTIVITY 3D
Find out more about this technology and determine whether this could result in the demise of the current solid state removable devices.
3.1.2 Input and output devices
This section will consider laser printers, inkjet printers, 3D printers, speakers, microphones, screens and sensors.
Laser printers
Laser printers use dry powder ink rather than liquid ink and make use of the properties of static electricity to produce the text and images. Unlike inkjet printers, for example, laser printers print the whole page in one go. Colour laser printers use four toner cartridges – cyan, magenta, yellow and black. Although the actual technology is different to monochrome printers, the printing method is similar, but colour dots are used to build up the text and images.
▲ Figure 3.9 A laser printer
When a user wishes to print a document using a laser printer, the following sequence of events takes place.
Stage | Description of what happens
1 | data from the document is sent to a printer driver
2 | printer driver ensures that the data is in a format that the chosen printer can understand
3 | check is made by the printer driver to ensure that the chosen printer is available to print (is it busy? is it off-line? is it out of ink? and so on)
4 | data is sent to the printer and stored in a temporary memory known as a printer buffer
5 | printing drum given a positive charge. As this drum rotates, a laser beam scans across it removing the positive charge in certain areas, leaving negatively charged areas which exactly match the text/images of the page to be printed
6 | drum is coated with positively charged toner (powdered ink). Since the toner is positively charged, it only sticks to the negatively charged parts of the drum
7 | negatively charged sheet of paper is rolled over the drum
8 | toner on the drum sticks to the paper to produce an exact copy of the page sent to the printer
9 | to prevent the paper sticking to the drum, the electric charge on the paper is removed after one rotation of the drum
10 | the paper goes through a fuser (a set of heated rollers), where the heat melts the ink so that it fixes permanently to the paper
11 | a discharge lamp removes all the electric charge from the drum so it is ready to print the next page
▲ Table 3.5 Sequence to print using a laser printer
Inkjet printers
Inkjet printers are made up of:
» a print head consisting of nozzles that spray droplets of ink onto the paper to form characters
» an ink cartridge or cartridges; either one cartridge for each colour (cyan, yellow and magenta) and a black cartridge, or one single cartridge containing all three colours and black (note: some systems use six colours)
» a stepper motor and belt which moves the print head assembly across the page from side to side
» a paper feed which automatically feeds the printer with pages as they are required.
▲ Figure 3.10 An inkjet printer
The ink droplets are currently produced using one of two technologies: thermal bubble or piezoelectric.
Thermal bubble – tiny resistors create localised heat which makes the ink vaporise. This causes the ink to form a tiny bubble; as the bubble expands, some of the ink is ejected from the print head onto the paper. When the bubble collapses, a small vacuum is created which allows fresh ink to be drawn into the print head. This continues until the printing cycle is completed.
Piezoelectric – a crystal is located at the back of the ink reservoir for each nozzle. The crystal is given a tiny electric charge which makes it vibrate. This vibration forces ink to be ejected onto the paper and at the same time more ink is drawn in for further printing.
When a user wishes to print a document using an inkjet printer, the following sequence of events takes place. Whatever technology is used, the basic steps in the printing process are the same.
Stage | Description of what happens
1 | data from the document is sent to a printer driver
2 | printer driver ensures that the data is in a format that the chosen printer can understand
3 | check is made by the printer driver to ensure that the chosen printer is available to print (is it busy? is it off-line? is it out of ink? and so on)
4 | data is sent to the printer and stored in a temporary memory known as a printer buffer
5 | a sheet of paper is fed into the main body of the printer. A sensor detects whether paper is available in the paper feed tray – if it is out of paper (or the paper is jammed), an error message is sent back to the computer
6 | as the sheet of paper is fed through the printer, the print head moves from side to side across the paper printing the text or image. The four ink colours are sprayed in their exact amounts to produce the desired final colour
7 | at the end of each full pass of the print head, the paper is advanced very slightly to allow the next line to be printed. This continues until the whole page has been printed
8 | if there is more data in the printer buffer, then the whole process from stage 5 is repeated until the buffer is empty
9 | once the printer buffer is empty, the printer sends an interrupt to the processor in the computer, which is a request for more data to be sent to the printer. The process continues until the whole of the document has been printed
▲ Table 3.6 Sequence to print using an inkjet printer
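The role of the printer buffer and the interrupt in the final stages of the sequence can be sketched in a few lines of Python. This is a simplified model only: the class `PrinterBuffer`, the function `print_document` and the idea of measuring the buffer in whole pages are assumptions made for illustration, not a description of how any real printer is programmed.

```python
from collections import deque


class PrinterBuffer:
    """Hypothetical model of a printer buffer with a fixed capacity (in pages)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pages = deque()

    def load(self, document: deque) -> None:
        # The computer sends data until the buffer is full or the document ends.
        while document and len(self.pages) < self.capacity:
            self.pages.append(document.popleft())

    def is_empty(self) -> bool:
        return not self.pages


def print_document(pages: list, capacity: int = 2) -> tuple:
    """Return (pages printed in order, number of interrupts sent to the processor)."""
    document = deque(pages)
    buffer = PrinterBuffer(capacity)
    printed = []
    interrupts = 0

    buffer.load(document)                        # stage 4: data stored in the buffer
    while not buffer.is_empty():
        printed.append(buffer.pages.popleft())   # stages 5 to 7: print one page
        if buffer.is_empty():
            interrupts += 1                      # stage 9: interrupt requests more data
            buffer.load(document)                # processor responds if data remains
    return printed, interrupts


print(print_document(["p1", "p2", "p3", "p4", "p5"], capacity=2))
```

In this model the printer raises an interrupt every time its buffer empties; printing stops when an interrupt is answered with no further data.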
3D printers
▲ Figure 3.11 A 3D printer
3D printers are used to produce working, solid objects. They are primarily based on inkjet and laser printer technology. The solid object is built up layer by layer using materials such as powdered resin, powdered metal, paper or ceramic. The artificial bone framework in Figure 3.12 was made from many layers (100 µm thick) of powdered metal using a technology known as binder 3D printing.
▲ Figure 3.12 Artificial bone framework made using an industrial 3D printer
Various types of 3D printers exist; they range from the size of a microwave oven up to the size of a small car. 3D printers use additive manufacturing (the object is built up layer by layer); this is in contrast to the more traditional method of subtractive manufacturing (removal of material to make the object). For example, making a statue using a 3D printer would involve building it up layer by layer using powdered stone until the final object was formed. The subtractive method would involve carving the statue out of solid stone (removing the stone not required) until the final item was produced. Similarly, CNC machining removes metal to form an object; 3D printing would produce the same item by building up the object from layers of powdered metal.
Direct 3D printing uses inkjet technology; a print head can move left to right as in a normal printer. However, the print head can also move up and down to build up the layers of an object.
Binder 3D printing is similar to direct 3D printing. However, this method uses two passes for each of the layers; the first pass sprays dry powder and then on the second pass a binder (a type of glue) is sprayed to form a solid layer.
Newer technologies use lasers and UV light to harden liquid polymers; this further increases the diversity of products which can be made.
Speakers and microphones
Speakers
Digitised sound stored in a file on a computer can be converted into sound as follows:
» The digital data is first passed through a digital to analogue converter (DAC) where it is converted into an electric current.
» This is then passed through an amplifier (since the current generated by the DAC will be small) to create a current large enough to drive a loudspeaker.
» This electric current is then fed to a loudspeaker where it is converted into sound.
The following schematic shows how this is done.
10010101011 → DAC → amplifier → speaker
▲ Figure 3.13 Digital to analogue conversion
As Figure 3.13 shows, if the sound is stored in a computer file, it must first pass through a digital to analogue converter (DAC) to convert the digital data into an electric current which can be used to drive the loudspeaker. Figure 3.14 shows how a loudspeaker can convert electric signals into sound waves.
▲ Figure 3.14 Diagram showing how a loudspeaker works (labels: plastic or paper cone; permanent magnet; coil of wire wrapped around an iron core; electric current fed to wire; sound waves produced)
» When an electric current flows through a coil of wire that is wrapped around an iron core, the core becomes a temporary electromagnet; a permanent magnet is also positioned very close to this electromagnet. » As the electric current through the coil of wire varies, the induced magnetic field in the iron core also varies. This causes the iron core to be attracted towards the permanent magnet and as the current varies this will cause the iron core to vibrate. » Since the iron core is attached to a cone (made from paper or thin synthetic material), this causes the cone to vibrate, producing sound.
The rate at which the DAC can translate the digital output into analogue voltages is known as the sampling rate. If the DAC is a 16-bit device, then it can accept numbers between +32 767 ($2^{15} - 1$) and –32 768 ($-2^{15}$); the digital value containing all zeros is ignored.
Microphones
Microphones are either built into the computer or are external devices connected through the USB port or through wireless connectivity.
▲ Figure 3.15 Diagram of how a microphone works (labels: sound waves; diaphragm; cone; coil wrapped around a permanent magnet; output from the microphone)
» When sound is created, it causes the air to vibrate. » When a diaphragm in the microphone picks up the air vibrations, the diaphragm also begins to vibrate. » A copper coil is wrapped around a permanent magnet and the coil is connected to the diaphragm using a cone. As the diaphragm vibrates, the cone moves in and out causing the copper coil to move backwards and forwards. » This forwards and backwards motion causes the magnetic field around the permanent magnet to be disturbed, inducing an electric current. » The electric current is then either amplified or sent to a recording device. The electric current is analogue in nature.
Figure 3.15 shows how a microphone can convert sound waves into an electric current. The current produced can either be stored as sound (on, for example, a CD), amplified and sent to a loudspeaker, or sent to a computer for storage.
The electric current output from the microphone can also be sent to a computer where a sound card converts the current into a digital signal which can then be stored in the computer. The following diagram shows what happens when the word ‘hut’ is picked up by a microphone and is converted into digital values:
sound wave for ‘HUT’ → digital value after conversion: 1000 0001 0001 1110 1000 1110 0001 1100 1100 1100 1101 1110
▲ Figure 3.16 Analogue to digital conversion
Look at Figure 3.16. The word ‘hut’ (in the form of a sound wave) has been picked up by a microphone; this is then converted using an analogue to digital converter (ADC) into digital values which can then be stored in a computer or manipulated as required using appropriate software.
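The conversion performed by an ADC can be sketched in Python. The example below is an illustration under stated assumptions: a sine wave stands in for the sound wave, samples are scaled to the range –1.0 to +1.0, 4-bit codes are used to mirror the groups of four bits in Figure 3.16, and the function names (`signed_range`, `quantise`, `sample_wave`) are hypothetical. The first function also shows the range of values available to a 16-bit device, assuming two's complement representation.

```python
import math


def signed_range(bits: int) -> tuple:
    """Smallest and largest values of a two's complement number with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def quantise(sample: float, bits: int) -> int:
    """Map an analogue sample in the range -1.0..+1.0 onto an unsigned `bits`-bit code."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, sample))
    # Scale to 0..(levels - 1) and round to the nearest whole level.
    return int((clipped + 1.0) / 2.0 * (levels - 1) + 0.5)


def sample_wave(frequency_hz, sampling_rate_hz, n_samples, bits=4):
    """Sample a sine wave at regular intervals; return each sample as a binary string."""
    codes = []
    for n in range(n_samples):
        t = n / sampling_rate_hz                              # time of this sample
        analogue = math.sin(2 * math.pi * frequency_hz * t)   # stand-in for the sound
        codes.append(format(quantise(analogue, bits), f"0{bits}b"))
    return codes


print(signed_range(16))   # (-32768, 32767)
print(sample_wave(frequency_hz=1, sampling_rate_hz=8, n_samples=8))
```

A higher sampling rate or a larger number of bits per sample gives a closer match to the original analogue wave, at the cost of more data to store.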
Screens
Screens are used to show the output from a computer. Modern screens use an LCD, backlit with LEDs or the newer organic light emitting diode (OLED) technology.
Figure 3.17 shows a simplified form of how OLED technology works.
▲ Figure 3.17 Simplified form of how OLED technology works (layers, top to bottom: glass or plastic top layer; metallic cathode (negative charge); emissive layer; conductive layer; glass anode (positive charge); glass or plastic bottom layer; negative and positive charges are shown moving between the electrodes)
OLEDs use organic materials (made up of carbon compounds) to create flexible semiconductors. Organic films are sandwiched between two charged electrodes (one is a metallic cathode and the other a glass anode). When an electric field is applied to the electrodes, the organic films give off light. This means that no form of back lighting is required. This allows for very thin screens. It also means that there is no longer a need to use LCD technology, since OLED is a self-contained system.
Screen displays are based on the pixel (the smallest picture element) concept where each screen pixel is made up of three sub-pixels, which are red, green and blue. By varying the intensity of the three sub-pixels, it is possible to generate millions of colours. The greater the number of pixels on a screen, the greater is the screen resolution (the number of pixels which can be viewed horizontally and vertically on screen; for example, 1680 × 1080 pixels). LCD and OLED screens use this type of pixel matrix to make up the picture.
The ‘purple’ pixel is made up of a combination of three sub-pixels, which are red, green and blue, in the required intensity, to ‘fool’ the eye into seeing a purple dot on the screen. The whole screen is filled with thousands of these tiny pixels.
▲ Figure 3.18 The pixel matrix
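The claim that three sub-pixels can produce millions of colours can be checked with a short calculation. The sketch below assumes that each sub-pixel intensity is stored in 8 bits, which is a common choice but is not stated in the text above; `pixel_count` is an illustrative helper name.

```python
BITS_PER_SUB_PIXEL = 8  # assumption: 8 bits each for red, green and blue

levels = 2 ** BITS_PER_SUB_PIXEL   # 256 intensity levels per sub-pixel
total_colours = levels ** 3        # one level chosen for each of the three sub-pixels


def pixel_count(width: int, height: int) -> int:
    """Total number of pixels on a screen with the given resolution."""
    return width * height


print(total_colours)              # 16777216
print(pixel_count(1680, 1080))    # 1814400
```

Under this assumption a pixel can show just over 16.7 million different colours, and a 1680 × 1080 screen contains about 1.8 million pixels.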
Touch screens (which act as both input and output devices) also make use of LCD and OLED technology. They are particularly used in mobile phones and tablets. We shall now consider LCD capacitive and resistive touch screen technologies.
Capacitive
» Made up of many layers of glass that act like a capacitor creating electric fields between the glass plates in layers.
» When the top glass layer is touched, the electric current changes and the coordinates where the screen was touched are determined by an on board microprocessor.
Benefits
» Medium cost technology.
» Screen visibility is good even in strong sunlight.
» Permits multi-touch capability.
» Screen is very durable; it takes a major impact to break the glass.
Drawbacks
» Only allows use of bare fingers as the form of input, although the latest screens permit the use of a special stylus.
Resistive
» Makes use of an upper layer of polyester (a form of plastic) and a bottom layer of glass.
» When the top polyester layer is touched, the top layer and bottom layer complete a circuit.
» Signals are then sent out, which are interpreted by a microprocessor and the calculations determine the coordinates of where the screen was touched.
Benefits
» Relatively inexpensive technology.
» Possible to use bare fingers, gloved fingers or stylus to carry out an input operation.
Drawbacks
» Screen visibility is poor in strong sunlight.
» Does not permit multi-touch capability.
» Screen durability is only fair; it is vulnerable to scratches and the screen wears out through time.
Virtual headsets Virtual reality has now been around for many years and has many applications. For example, it is possible to ‘walk around’ inside dangerous areas – such as a nuclear power plant – without actually being there. It allows engineers to plan modifications or repairs to a plant in complete safety and to try out different scenarios first before implementing them. One of the devices used is a virtual reality headset which gives the engineer the feeling of being there. We will now describe how these devices work. » Video is sent from a computer to the headset (either using an HDMI cable or a smartphone fitted into the headset). » Two feeds are sent to an LCD/OLED display (sometimes two screens are used, one for the left side of the image and one for the right side of the image); lenses placed between the eyes and the screen allow for focusing and reshaping of the image/video for each eye, thus giving a 3D effect and adding to the realism. » Most headsets use 110° field of view which is enough to give a pseudo 360° surround image/video. » A frame rate of 60 to 120 images per second is used to give a true/realistic image. » As the user moves their head (up and down or left to right), a series of sensors and/or LEDs measure this movement, which allows the image/video on the screen to react to the user’s head movements (sensors are usually gyroscopic or accelerometers; LEDs are used in conjunction with mini cameras to further monitor head movements). » Headsets also use binaural sound (surround sound) so that the speaker output appears to come from behind, from the side or from a distance, giving very realistic 3D sound.
» Some headsets also use infrared sensors to monitor eye movement (in addition to head movement), which allows the depth of field on the screen to be more realistic; an example of this is to make objects in the foreground appear fuzzy when the user’s eyes indicate they are looking into the distance (and vice versa).
Sensors
Sensors are input devices which read or measure physical properties, such as temperature, pressure, acidity, and so on. Real data is analogue in nature – this means it is constantly changing and does not have a discrete value. Analogue data usually requires some form of interpretation, for example, the temperature shown on a mercury thermometer requires the user to look at the height of the mercury to work out the temperature. The temperature, therefore, can have an infinite number of values depending on the precision of how the height of the mercury is measured. Equally, an analogue clock requires the user to look at the hands on the clock face. The area swept out by the hands allows the number of hours and minutes to be interpreted. There are many other examples.
Computers cannot make any sense of these physical quantities and the data needs to be converted into a digital format. This is usually achieved by an analogue to digital converter (ADC). This device converts physical values into discrete digital values.
analogue data → ADC → 1 0 0 1 1 1 0 0 ... digital data
▲ Figure 3.19 Converting analogue data into digital data
When a computer is used to control devices, such as a motor or a valve, it is often necessary to use a digital to analogue converter (DAC), since these devices need analogue data to operate in many cases. Frequently, an actuator is used in these control applications. Although these are technically output devices, they are mentioned here since they are an integral part of the control system. An actuator is an electromechanical device such as a relay, solenoid or motor. Note that a solenoid is an example of a digital actuator as part of the device is connected to a computer which opens and closes a circuit as required. When energized, the solenoid may operate a plunger or armature to control, for example, a fuel injection system. Other actuators, such as motors and valves, may require a DAC so that they receive an electric current rather than a simple digital signal direct from the computer. Notice the importance of (positive) feedback, which is where the output from the system can affect the next input. This is due to the fact that sensor readings may cause the microprocessor to alter a valve or a motor, for example, which will then change the next reading taken by the sensor. So the output from the microprocessor will impact on the next input received as it attempts to bring the system within the desired parameters.
Table 3.7 shows a number of common sensors and examples of their applications.
Sensor | Example applications
temperature | control a central heating system; control/monitor a chemical process; control/monitor temperature in a greenhouse
moisture/humidity | control/monitor moisture/humidity levels in soil/air in a greenhouse; monitor dampness levels in an industrial application (for example, monitor moisture in a paint spray booth in a car factory)
light | switch street lighting on at night and off during the day; monitor/control light levels in a greenhouse; switch on car headlights when it gets dark
infrared/motion | turn on windscreen wipers on a car when it rains; detect an intruder in a burglar alarm system; count people entering or leaving a building
pressure | detect intruders in a burglar alarm system; check weight (such as the weight of a vehicle); monitor/control a process where gas pressure is important
acoustic/sound | pick up noise levels (such as footsteps or breaking glass) in a burglar alarm system; detect noise of liquids dripping from a pipe
gas (such as O2 or CO2) | monitor pollution levels in a river or air; measure O2 and CO2 levels in a greenhouse; check for CO2 or NO2 leaks in a power station
pH | monitor/control acidity/alkalinity levels in soil; monitor pollution in rivers
magnetic field | detect changes in magnetic field in cell phones, CD players, and so on; used in anti-lock braking systems in motor vehicles
▲ Table 3.7 Common sensors and examples of applications
Sensors are used in both monitoring and control applications. There is a subtle difference between how these two methods work. The flowchart (Figure 3.20) shows a simplification of the process.
Both systems:
» sensors send signals to the microprocessor or computer
» the signals are converted to digital (if necessary) using an analogue to digital converter (ADC)
» the microprocessor or computer analyses the data received by checking it against stored values
Monitoring system:
» if new data is outside the acceptable range, a warning message is sent to a screen or an alarm is activated
» the microprocessor or computer has no effect on what is being monitored – it is simply ‘watching’ the process
Control system:
» if the new data is outside the acceptable range, the microprocessor or computer sends signals to control valves, motors, and so on
» the output from the system affects the next set of inputs from the sensors (feedback loop)
▲ Figure 3.20 Sensors for monitoring and controlling systems
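The difference between the two branches of the flowchart can be sketched in Python. This is a minimal illustration, assuming a temperature sensor and a heater; the function names, the acceptable range and the rate at which the temperature rises or falls are all invented for the example.

```python
def monitor(readings, low, high):
    """Monitoring: compare each reading with stored limits and only raise warnings."""
    return [f"warning: {r}" for r in readings if not low <= r <= high]


def control(start_temp, low, high, steps):
    """Control: the heater setting changes the next sensor reading (feedback loop)."""
    temp = start_temp
    heater_on = False
    history = []
    for _ in range(steps):
        # Check the latest reading against the stored values.
        if temp < low:
            heater_on = True       # too cold: send signal to switch the heater on
        elif temp > high:
            heater_on = False      # too hot: send signal to switch the heater off
        # The output affects the next input: assumed +1.0 when heating, -0.5 when not.
        temp += 1.0 if heater_on else -0.5
        history.append((temp, heater_on))
    return history


print(monitor([18, 25, 31], low=20, high=30))
for temp, heater_on in control(18.0, low=20.0, high=22.0, steps=10):
    print(temp, "heater on" if heater_on else "heater off")
```

The monitoring function never changes what it is watching, whereas in the control function each decision alters the temperature that will be read on the next pass through the loop.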
Table 3.8 shows some examples of monitoring and control applications of sensors.
Examples of monitoring
» monitoring a patient in a hospital for vital signs such as heart rate, temperature, and so on
» checking for intruders in a burglar alarm system
» checking the temperature levels in a car engine
» monitoring pollution levels in a river
Examples of control
» turning street lights on at night and turning them off again during daylight
» controlling the temperature in a central heating/air conditioning system
» controlling the traffic lights at a road junction
» operating anti-lock brakes on a car when necessary
» controlling the environment in a greenhouse
▲ Table 3.8 Examples of monitoring and control applications of sensors
One of the most common uses of sensors in modern times is in the monitoring and control of a number of functions in motor vehicles and aeroplanes. Look at Figure 3.21 showing a typical modern car and its many sensors used to control or monitor several functions.
▲ Figure 3.21 Sensors on a typical modern car (labels: front airbag sensors; rear lighting control; collision avoidance system; front lighting control; engine management; ABS)
Below is an in-depth look at just one of the sensor systems labelled on Figure 3.21. Anti-lock braking systems (on cars) Anti-lock braking systems (ABS) on cars use magnetic field sensors to stop the wheels locking up on the car if the brakes have been applied too sharply. » When one of the car wheels rotates too slowly (it is locking up), a magnetic field sensor sends data to a microprocessor. » The microprocessor checks the rotation speed of the other three wheels. » If they are different (rotating faster), the microprocessor sends a signal to the braking system and the braking pressure to the affected wheel is reduced. » The wheel’s rotational speed is then increased to match the other wheels. » The checking of the rotational speed using these magnetic field sensors is done several times a second and the braking pressure to all the wheels can be constantly changing to prevent any of the wheels locking up under heavy braking. » This is felt as a ‘judder’ on the brake pedal as the braking system is constantly switched off and on to equalise the rotational speed of all four wheels. » If one of the wheels is rotating too quickly, braking pressure is increased to that wheel until it matches the other three.
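The decision made by the microprocessor on each check of the wheel speeds can be sketched as follows. This is an illustration only: the function name `abs_adjust`, the use of the median wheel speed as the reference, the 10% tolerance and the pressure step of 20 (on a 0 to 100 scale) are all assumptions, not values from a real anti-lock braking system.

```python
from statistics import median


def abs_adjust(wheel_speeds, pressures, tolerance=0.1, step=20):
    """Return new brake pressures (0 to 100) after one check of the four wheel speeds."""
    reference = median(wheel_speeds)   # assumed stand-in for the speed of the other wheels
    new_pressures = list(pressures)
    for i, speed in enumerate(wheel_speeds):
        if speed < reference * (1 - tolerance):
            # Wheel is rotating too slowly (locking up): reduce braking pressure.
            new_pressures[i] = max(0, pressures[i] - step)
        elif speed > reference * (1 + tolerance):
            # Wheel is rotating too quickly: increase braking pressure.
            new_pressures[i] = min(100, pressures[i] + step)
    return new_pressures


print(abs_adjust([10, 50, 50, 50], [80, 80, 80, 80]))   # [60, 80, 80, 80]
```

In a real system a check of this kind would be repeated several times a second, which is what produces the ‘judder’ felt on the brake pedal.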
ACTIVITY 3A
1 a) i) Describe three differences between RAM and ROM.
ii) Compare the relative advantages and disadvantages of SRAM and DRAM. Include examples of where each type of memory would be used in a computer.
b) Secondary storage can be magnetic, optical or solid state. Describe two features of each type of storage which differentiate it from the other two types.
2 a) Explain the main differences in operation of a laser printer compared with an inkjet printer.
b) i) Name one application of a laser printer and one application of an inkjet printer.
ii) For each of your named applications in part b) i), give a reason why the chosen printer is the most suitable.
3 An art gallery took several photographs of a valuable, fragile painting. The images were sent to a computer where they were processed by a 3D printing application. A 3D printout of the painting was produced showing the texture of the oil paint, canvas and any flaws in the painting. Give reasons why the art gallery would wish to make this 3D replica.
4 The following diagram shows a schematic of a microprocessor-controlled street lighting system.
sensor → ADC → microprocessor → DAC → street light
The microprocessor is used to control the operation of the street lamp. The lamp is fitted with a light sensor which constantly sends data to the microprocessor. The data value from the sensor changes according to whether it is sunny, cloudy, raining, night time, and so on. Describe how the microprocessor would be used to automatically switch on the light at night and switch it off again when it becomes light. Include a feature to stop the light constantly flickering on and off when it becomes overcast or cars go past with full headlights at night.
EXTENSION ACTIVITY 3E
1 Look at this simplified diagram of a keyboard; the letter H has been pressed. Explain:
a) how pressing the letter H has been recognised by the computer
b) how the computer manages the very slow process of inputting data from a keyboard.
(Diagram labels: keys G, H and J; letter H has been pressed and now makes contact with bottom conductive layer; conductive layers; insulating layer; letter H interpreted by computer)
2 a) Describe how these types of pointing devices work.
i) Mechanical mouse
ii) Optical mouse
b) Connectivity between mouse and computer can be through USB cable or wireless. Explain these two types of connectivity.
EXTENSION ACTIVITY 3F
Another new screen technology is known as quantum LED (QLED), which is in direct competition with organic LED (OLED). Look at this statement: ‘QLED televisions are simply LED televisions that use quantum dots to enhance their overall performance in key picture quality areas.’ Find out the main differences between QLED and OLED technologies.
3.2 Logic gates and logic circuits
Key terms
Logic gates – electronic circuits which rely on ‘on/off’ logic. The most common ones are NOT, AND, OR, NAND, NOR and XOR.
Logic circuit – formed from a combination of logic gates and designed to carry out a particular task. The output from a logic circuit will be 0 or 1.
Truth table – a method of checking the output from a logic circuit. They use all the possible binary input combinations depending on the number of inputs; for example, two inputs have $2^2$ (4) possible binary combinations, three inputs will have $2^3$ (8) possible binary combinations, and so on.
Boolean algebra – a form of algebra linked to logic circuits and based on TRUE and FALSE.
3.2.1 Logic gates Electronic circuits in computers, many memories and controlling devices are made up of thousands of logic gates. Logic gates take binary inputs and produce a binary output. Several logic gates combined together form a logic circuit and these circuits are designed to carry out a specific function. The checking of the output from a logic gate or logic circuit can be done using a truth table. This section will consider the function and role of logic gates, logic circuits and truth tables. A number of possible applications of logic circuits will also be considered. A reference to Boolean algebra will be made throughout this section, although this is covered in more depth in Chapter 15. Six different logic gates will be considered in this section.
NOT gate
AND gate
OR gate
NAND gate
NOR gate
XOR gate
▲ Figure 3.22 Six types of logic gate
3.2.2 Truth tables
Truth tables are used to trace the output from a logic gate or logic circuit. The NOT gate is the only logic gate with one input; the other five gates have two inputs. When constructing truth tables, all possible combinations of 1s and 0s which can be input are considered. For the NOT gate (one input) there are only 2¹ (2) possible binary combinations. For all other gates (two inputs), there are 2² (4) possible binary combinations. For logic circuits, the number of inputs can be more than 2; for example, three inputs give a possible 2³ (8) binary combinations. And for four inputs, the number of possible binary combinations is 2⁴ (16). It is clear that the number of possible binary combinations is a power of 2 in every case. Table 3.9 summarises this.
Inputs    Inputs      Inputs
A B       A B C       A B C D
0 0       0 0 0       0 0 0 0
0 1       0 0 1       0 0 0 1
1 0       0 1 0       0 0 1 0
1 1       0 1 1       0 0 1 1
          1 0 0       0 1 0 0
          1 0 1       0 1 0 1
          1 1 0       0 1 1 0
          1 1 1       0 1 1 1
                      1 0 0 0
                      1 0 0 1
                      1 0 1 0
                      1 0 1 1
                      1 1 0 0
                      1 1 0 1
                      1 1 1 0
                      1 1 1 1
▲ Table 3.9
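The rows of a truth table can also be generated rather than written out by hand. The following is a minimal Python sketch, not part of the syllabus material; the function name `input_combinations` is made up for illustration. It lists every combination of 0s and 1s in the same counting order used in Table 3.9.

```python
from itertools import product

def input_combinations(n_inputs):
    """Return every combination of 0s and 1s for n_inputs inputs, in counting order."""
    return list(product((0, 1), repeat=n_inputs))

# The number of rows doubles with each extra input.
for n in (1, 2, 3, 4):
    print(n, "input(s):", len(input_combinations(n)), "combinations")

# The four rows for a two-input gate (inputs A and B).
for a, b in input_combinations(2):
    print(a, b)
```

Calling `input_combinations(3)` gives the eight rows needed for a three-input logic circuit.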
3.2.3 The function of the six logic gates
NOT gate
▲ Figure 3.23 NOT gate (input A, output X)
Description
The output, X, is 1 if the input A is NOT 1
How to write this
X = NOT A (logic notation)
$X = \overline{A}$ (Boolean algebra)
Truth table
Input   Output
A       X
0       1
1       0
▲ Table 3.10

AND gate
▲ Figure 3.24 AND gate (inputs A and B, output X)
Description
The output, X, is 1 if input A is 1 and input B is 1
How to write this
X = A AND B (logic notation)
X = A.B (Boolean algebra)
Truth table
Inputs   Output
A B      X
0 0      0
0 1      0
1 0      0
1 1      1
▲ Table 3.11

OR gate
▲ Figure 3.25 OR gate (inputs A and B, output X)
Description
The output, X, is 1 if input A is 1 or input B is 1
How to write this
X = A OR B (logic notation)
X = A + B (Boolean algebra)
Truth table
Inputs   Output
A B      X
0 0      0
0 1      1
1 0      1
1 1      1
▲ Table 3.12

NAND gate (NOT AND)
▲ Figure 3.26 NAND gate (inputs A and B, output X)
Description
The output, X, is 1 if input A is NOT 1 or input B is NOT 1
How to write this
X = A NAND B (logic notation)
$X = \overline{A.B}$ (Boolean algebra)
Truth table
Inputs   Output
A B      X
0 0      1
0 1      1
1 0      1
1 1      0
▲ Table 3.13

NOR gate (NOT OR)
▲ Figure 3.27 NOR gate (inputs A and B, output X)
Description
The output, X, is 1 if input A is NOT 1 and input B is NOT 1
How to write this
X = A NOR B (logic notation)
$X = \overline{A + B}$ (Boolean algebra)
Truth table
Inputs   Output
A B      X
0 0      1
0 1      0
1 0      0
1 1      0
▲ Table 3.14
XOR gate
▲ Figure 3.28 XOR gate (inputs A and B, output X)
Description
The output, X, is 1 if (input A is 1 AND input B is NOT 1) OR (input A is NOT 1 AND input B is 1)
How to write this
X = A XOR B (logic notation)
$X = (A.\overline{B}) + (\overline{A}.B)$ (Boolean algebra)
(Note: this is sometimes written as: $(A + B).\overline{A.B}$)
Truth table
Inputs   Output
A B      X
0 0      0
0 1      1
1 0      1
1 1      0
▲ Table 3.15
EXTENSION ACTIVITY 3G Using truth tables, show that $X = (A.\overline{B}) + (\overline{A}.B)$ and $X = (A + B).\overline{A.B}$ both represent the XOR logic gate.
You will notice, in the Boolean algebra, three new symbols.
» A dot (.) represents the AND operation (it can be written as ∧).
» A plus sign (+) represents the OR operation (it can be written as ∨).
» A dash above a letter (for example, $\overline{A}$) represents the NOT operation.
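The six gates described in this section can be modelled in a few lines of code. The following Python sketch is for illustration only and assumes that inputs are always the integers 0 or 1; the helper `truth_table` is a made-up name. NAND and NOR are written as NOT applied to AND and OR, which mirrors the dash drawn above the whole expression in the Boolean algebra.

```python
def NOT(a):
    return 1 - a          # 0 becomes 1, 1 becomes 0

def AND(a, b):
    return a & b          # A.B

def OR(a, b):
    return a | b          # A + B

def NAND(a, b):
    return NOT(AND(a, b)) # NOT applied to the whole of A.B

def NOR(a, b):
    return NOT(OR(a, b))  # NOT applied to the whole of A + B

def XOR(a, b):
    return a ^ b          # 1 only when the two inputs differ

def truth_table(gate):
    """Output column of a two-input gate for inputs 00, 01, 10, 11."""
    return [gate(a, b) for a in (0, 1) for b in (0, 1)]

for gate in (AND, OR, NAND, NOR, XOR):
    print(gate.__name__, truth_table(gate))
```

Each printed list can be compared with the output column of the corresponding truth table in this section.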
3.2.4 Logic circuits
When logic gates are combined to carry out a particular function, such as controlling a robot, they form a logic circuit. The output from the logic circuit is checked using a truth table. The following three examples show how to:
» produce a truth table
» design a logic circuit from a given logic statement/Boolean algebra
» design a logic circuit to carry out an actual safety function.
Example 3.1
Produce a truth table for the following logic circuit (note the use of a dot at junctions):
[Figure: logic circuit with inputs A, B and C, intermediate values P, Q and R, and output X, divided into part 1, part 2 and part 3]
Solution
There are three inputs to this logic circuit; therefore, there will be eight possible binary combinations which can be input.
To show step-wise how the truth table is produced, the logic circuit has been split up into three parts and intermediate values are shown as P, Q and R.
Part 1
This is the first part of the logic circuit; the first task is to find the intermediate values P and Q.
[Figure: part 1 of the logic circuit, with inputs A, B and C and intermediate outputs P and Q]
The value of P is found from the AND gate where the inputs are A and B. The value of Q is found from the NOR gate where the inputs are B and C. An intermediate truth table is produced:
Inputs    Outputs
A B C     P Q
0 0 0     0 1
0 0 1     0 0
0 1 0     0 0
0 1 1     0 0
1 0 0     0 1
1 0 1     0 0
1 1 0     1 0
1 1 1     1 0
Part 2
The second part of the logic circuit has P and Q as inputs and the intermediate output, R.
[Figure: part 2 of the logic circuit, with inputs P and Q and output R]
This produces the following intermediate truth table (note: even though there are only two inputs to the logic gate, we have generated eight binary values in Part 1 and these must all be used in this second truth table).
Inputs   Output
P Q      R
0 1      1
0 0      0
0 0      0
0 0      0
0 1      1
0 0      0
1 0      1
1 0      1
Part 3
The final part of the logic circuit has R and C as inputs and the final output, X.
[Figure: part 3 of the logic circuit, with inputs R and C and output X]
This gives the third intermediate truth table.
Inputs   Output
R C      X
1 0      1
0 1      1
0 0      0
0 1      1
1 0      1
0 1      1
1 0      1
1 1      0
Putting all three intermediate truth tables together produces the final truth table which represents the original logic circuit.
Inputs    Intermediate values   Output
A B C     P Q R                 X
0 0 0     0 1 1                 1
0 0 1     0 0 0                 1
0 1 0     0 0 0                 0
0 1 1     0 0 0                 1
1 0 0     0 1 1                 1
1 0 1     0 0 0                 1
1 1 0     1 0 1                 1
1 1 1     1 0 1                 0
ACTIVITY 3B Produce truth tables for each of the following logic circuits. You are advised to split them up into intermediate parts to help eliminate errors.
[Figures: logic circuits a) to e). Circuits a) and c) have inputs A and B; circuits b), d) and e) have inputs A, B and C. Each has a single output X.]
Example 3.2
A safety system uses three inputs to a logic circuit. An alarm, X, sounds if input A represents ON and input B represents OFF, or if input B represents ON and input C represents OFF.
Produce a logic circuit and truth table to show the conditions which cause the output X to be 1.
Solution
The first thing to do is to write down the logic statement representing the scenario in this example. To do this, it is necessary to recall that ON = 1 and OFF = 0 and also that 0 is considered to be NOT 1.
So, we get the following logic statement: X = 1 if
Part 1: (A = 1 AND B = NOT 1), which equates to A is ON and B is OFF
Part 2: OR, the two parts are connected by the OR gate
Part 3: (B = 1 AND C = NOT 1), which equates to B is ON and C is OFF
This statement can also be written in Boolean algebra as: $(A.\overline{B}) + (B.\overline{C})$
The logic circuit is made up of three parts as shown in the logic statement. We will produce the logic circuits for Part 1 and Part 3, then join both parts together with the OR gate.
[Figure: Part 1 circuit with inputs A and B; Part 3 circuit with inputs B and C]
Now, combining both parts with Part 2 (the OR gate) gives us:
[Figure: complete logic circuit with inputs A, B and C, Parts 1, 2 and 3, and output X]
There are two ways to produce the truth table.
» Trace through the logic circuit using the method described in Example 3.1.
» Use the original logic statement; this allows you to check that your logic circuit is correct.
We will use the second method in this example.
Inputs    Intermediate values                        Output
A B C     (A=1 AND B=NOT 1)   (B=1 AND C=NOT 1)      X
0 0 0     0                   0                      0
0 0 1     0                   0                      0
0 1 0     0                   1                      1
0 1 1     0                   0                      0
1 0 0     1                   0                      1
1 0 1     1                   0                      1
1 1 0     0                   1                      1
1 1 1     0                   0                      0
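The logic statement from Example 3.2 can also be checked in code. The sketch below is an illustration only, written in Python with made-up names (`alarm`, `part1`, `part3`), and assumes each input is the integer 0 (OFF) or 1 (ON). It evaluates the two intermediate values and then ORs them, which is the same order of working as the truth table.

```python
from itertools import product

def alarm(a, b, c):
    """Return (part 1, part 3, X) for the safety system in Example 3.2."""
    part1 = a & (1 - b)   # A is ON and B is OFF
    part3 = b & (1 - c)   # B is ON and C is OFF
    x = part1 | part3     # Part 2: the OR gate joining the two parts
    return part1, part3, x

print("A B C | part1 part3 | X")
for a, b, c in product((0, 1), repeat=3):
    part1, part3, x = alarm(a, b, c)
    print(a, b, c, "|", part1, part3, "|", x)
```

Comparing the printed rows with a truth table traced through the logic circuit is a quick way of finding a wiring mistake.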
ACTIVITY 3C Draw the logic circuits and complete the truth tables for these logic statements and Boolean algebra statements.
a) X = 1 if (A = 1 OR B = 1) OR (A = 0 AND B = 1)
b) Y = 1 if (A = 0 AND B = 0) AND (B = 0 OR C = 1)
c) T = 1 if (switch K is ON or switch L is ON) OR (switch K is ON and switch M is OFF) OR (switch M is ON)
d) X = (A.B) + (B.C)
e) R = 1 if (switch A is ON and switch B is ON) AND (switch B is ON or switch C is OFF)
Example 3.3
A wind turbine has a safety system which uses three inputs to a logic circuit. A certain combination of conditions results in an output, X, from the logic circuit being equal to 1. When the value of X = 1, the wind turbine is shut down. The following table shows which parameters are being monitored and form the three inputs to the logic circuit.
Parameter description   Parameter   Binary value   Description of condition
turbine speed           S           0              turbine speed ≤ 1000 rpm
                                    1              turbine speed > 1000 rpm
bearing temperature     T           0              bearing temperature ≤ 80 °C
                                    1              bearing temperature > 80 °C
wind velocity           W           0              wind velocity ≤ 120 kph
                                    1              wind velocity > 120 kph
The output, X, will have a value of 1 if any of the following combinations of conditions occurs:
» either turbine speed ≤ 1000 rpm and bearing temperature > 80 °C
» or turbine speed > 1000 rpm and wind velocity > 120 kph
» or bearing temperature ≤ 80 °C and wind velocity > 120 kph
Design the logic circuit and complete the truth table to produce a value of X = 1 when any of the three conditions occurs.
Solution
This is a different type of problem to those covered in Examples 3.1 and 3.2. This time, a real situation is given and it is necessary to convert the information into a logic statement and then produce the logic circuit and truth table. It is advisable in problems as complex as this to produce the logic circuit and truth table separately (based on the conditions given) and then check them against each other to see if there are any errors.
Stage 1
The first thing to do is to convert each of the three statements into logic statements. Use the information given in the table and the three condition statements to find how the three parameters S, T and W are linked. We usually look for the key words AND, OR and NOT when converting actual statements into logic. We end up with these three logic statements:
① turbine speed ≤ 1000 rpm and bearing temperature > 80 °C
logic statement: (S = NOT 1 AND T = 1)
② turbine speed > 1000 rpm and wind velocity > 120 kph
logic statement: (S = 1 AND W = 1)
③ bearing temperature ≤ 80 °C and wind velocity > 120 kph
logic statement: (T = NOT 1 AND W = 1)
Stage 2
This produces three intermediate logic circuits:
[Figure: intermediate logic circuits ① (inputs S and T), ② (inputs S and W) and ③ (inputs T and W)]
The three original statements were joined together by the word OR. So, we need to join the three intermediate logic circuits using two OR gates to get the final logic circuit. We will start by joining ① and ② together using an OR gate.
[Figure: circuits ① and ② joined by an OR gate, with inputs S, T and W]
Now, we connect this to logic circuit ③ to obtain the final logic circuit.
[Figure: final logic circuit with inputs S, T and W and output X]
The final part is to produce the truth table. We will do this using the original logic statement, since this method allows an extra check to be made on the final logic circuit. There were three parts to the problem, so the truth table will first evaluate each part. Then, by applying OR gates, the final value, X, is obtained:
① (S = NOT 1 AND T = 1)
② (S = 1 AND W = 1)
③ (T = NOT 1 AND W = 1)
We find the outputs from ① and ② and then OR these two outputs to obtain a new intermediate, which we will label part ④. We then OR parts ③ and ④ together to get the value of X.
Inputs    Intermediate values                                                        Output
S T W     ① (S=NOT 1 AND T=1)   ② (S=1 AND W=1)   ③ (T=NOT 1 AND W=1)   ④        X
0 0 0     0                     0                 0                     0        0
0 0 1     0                     0                 1                     0        1
0 1 0     1                     0                 0                     1        1
0 1 1     1                     0                 0                     1        1
1 0 0     0                     0                 0                     0        0
1 0 1     0                     1                 1                     1        1
1 1 0     0                     0                 0                     0        0
1 1 1     0                     1                 0                     1        1
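The wind turbine example can be sketched in code as two steps: converting the sensor readings into the binary inputs S, T and W, and then evaluating the logic statement. This Python sketch is illustrative only; the function names `encode` and `shutdown` and the sample readings are assumptions, while the thresholds come from the parameter table in Example 3.3.

```python
from itertools import product

def encode(speed_rpm, temp_c, wind_kph):
    """Convert readings into the binary inputs S, T and W."""
    s = int(speed_rpm > 1000)   # S = 1 if turbine speed > 1000 rpm
    t = int(temp_c > 80)        # T = 1 if bearing temperature > 80 °C
    w = int(wind_kph > 120)     # W = 1 if wind velocity > 120 kph
    return s, t, w

def shutdown(s, t, w):
    """Return X, which is 1 when the turbine must shut down."""
    part1 = (1 - s) & t         # (S = NOT 1 AND T = 1)
    part2 = s & w               # (S = 1 AND W = 1)
    part3 = (1 - t) & w         # (T = NOT 1 AND W = 1)
    part4 = part1 | part2       # first OR gate
    return part3 | part4        # second OR gate gives X

for s, t, w in product((0, 1), repeat=3):
    print(s, t, w, "->", shutdown(s, t, w))

# Hypothetical reading: slow turbine with a hot bearing in light wind.
print(shutdown(*encode(900, 95, 50)))
```

Keeping the encoding separate from the logic means the truth table can be checked without needing real sensor values.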
ACTIVITY 3D There are two scenarios described below. In each case, produce the logic circuit and complete a truth table to represent the scenario.
a) A chemical process is protected by a logic circuit. There are three inputs to the logic circuit representing key parameters in the chemical process. An alarm, X, will give an output value of 1 depending on certain conditions in the chemical process. This table describes the process conditions being monitored.
Parameter description        Parameter   Binary value   Description of condition
chemical reaction rate       R           0              reaction rate < 40 mol/l/sec
                                         1              reaction rate ≥ 40 mol/l/sec
process temperature          T           0              temperature > 115 °C
                                         1              temperature ≤ 115 °C
concentration of chemicals   C           0              concentration = 4 mol
                                         1              concentration > 4 mol
An alarm, X, will generate the value 1 if:
either reaction rate < 40 mol/l/sec
or concentration > 4 mol AND temperature > 115 °C
or reaction rate ≥ 40 mol/l/sec AND temperature > 115 °C.
b) A power station has a safety system controlled by a logic circuit. Three inputs to the logic circuit determine whether the output, S, is 1. When S = 1 the power station shuts down. The following table describes the conditions being monitored.
Parameter description   Parameter   Binary value   Description of condition
gas temperature         G           0              gas temperature ≤ 160 °C
                                    1              gas temperature > 160 °C
reactor pressure        R           0              reactor pressure ≤ 10 bar
                                    1              reactor pressure > 10 bar
water temperature       W           0              water temperature ≤ 120 °C
                                    1              water temperature > 120 °C
Output, S, will generate a value of 1 if:
either gas temperature > 160 °C AND water temperature ≤ 120 °C
or gas temperature ≤ 160 °C AND reactor pressure > 10 bar
or water temperature > 120 °C AND reactor pressure > 10 bar.
3.2.5 Logic circuits in the real world
The design of logic circuits is considerably more complex than has, so far, been described. We have discussed some of the fundamental theories, providing sufficient coverage of the Cambridge International A Level syllabus. However, it is worth discussing some of the more advanced aspects of logic circuit design, to strengthen understanding.
Electronics companies need to consider the cost of components, ease of fabrication and time constraints when designing and building logic circuits. Ways electronics companies review logic circuit design include:
» using ‘off-the-shelf’ logic units and building up the logic circuit as a number of ‘building blocks’
» simplifying the logic circuit as far as possible; this may be necessary where room is at a premium (for example, building circuit boards for use in satellites for space exploration).
Using logic ‘building blocks’
One common ‘building block’ is the NAND gate. It is possible to build up any logic gate, and therefore any logic circuit, by simply linking together a number of NAND gates, such as:
» the AND gate (inputs A and B)
▲ Figure 3.29 AND gate made from NAND gates
» the OR gate (inputs A and B)
▲ Figure 3.30 OR gate made from NAND gates
» the NOT gate (input A)
▲ Figure 3.31 NOT gate made from NAND gates
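The idea of building other gates from NAND gates alone can be tried out in code. The sketch below is in Python with made-up function names. Since the figures are not reproduced here, the wiring used is the standard textbook construction and is assumed, not confirmed, to match Figures 3.29 to 3.31: NOT is a NAND gate with both inputs joined, AND is a NAND gate followed by that NOT, and OR is a NAND gate fed by two such NOT gates.

```python
def NAND(a, b):
    """The only 'real' gate in this sketch; inputs are 0 or 1."""
    return 1 - (a & b)

def not_from_nand(a):
    return NAND(a, a)                    # both inputs joined together

def and_from_nand(a, b):
    n = NAND(a, b)
    return NAND(n, n)                    # invert the NAND output

def or_from_nand(a, b):
    return NAND(NAND(a, a), NAND(b, b))  # invert each input, then NAND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_from_nand(a, b), "OR:", or_from_nand(a, b))
print("NOT 0:", not_from_nand(0), "NOT 1:", not_from_nand(1))
```

This uses one NAND gate for NOT, two for AND and three for OR.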
ACTIVITY 3E
1 By drawing the truth tables, show that the three logic circuits shown above can be used to represent AND, OR and NOT gates.
2 a) Show how the following logic circuit could be built using NAND gates only. Complete truth tables for both logic circuits to show that they produce identical outputs.
[Figure: logic circuit with inputs A, B and C and output X]
b) Show how the XOR gate could be built from NAND gates only. Complete a truth table for your final design to show that it produces the same output as a single XOR gate.
3 By drawing a truth table, discover which single logic gate has the same function as the following logic circuit made up of NAND gates only.
[Figure: logic circuit made up of NAND gates, with inputs A and B and output X]
Simplification of logic circuits
The second method involves the simplification of logic circuits. By reducing the number of components, the cost of production can be reduced. This can also improve reliability and make it easier to trace faults if they occur. This is covered in more depth in Chapter 15.
EXTENSION ACTIVITY 3H
By drawing a truth table, show which single logic gate has the same function as the logic circuit drawn below.
[Figure: logic circuit with inputs A and B and output X]
3.2.6 Multi-input logic gates
This section looks at logic gates with more than two inputs (apart from the NOT gate). Students are not expected to answer questions about multi-input logic gates at Cambridge International AS Level, but this information is included here for completeness and for those with an electronics background. This is intended to complete the picture for interested students who may have seen multi-input gates in other textbooks, or online, and it leads neatly into topics covered in Chapter 15.
Logic gates (apart from the NOT gate) can have more than two inputs. While it is still acceptable to use two-input logic gates, it is worth considering the multi-input option when designing logic circuits; they can simplify the overall result.
Multi-input AND gates
[Figure: two two-input AND gates in series, with inputs A, B and C, is the same as a single three-input AND gate with inputs A, B and C]
▲ Figure 3.32 Multi-input AND gate
Both sets of AND gates have the output A.B.C and they share identical truth tables.
Inputs    Output
A B C     A.B.C
0 0 0     0
0 0 1     0
0 1 0     0
0 1 1     0
1 0 0     0
1 0 1     0
1 1 0     0
1 1 1     1
▲ Table 3.16
Now consider the following:
[Figure: three two-input AND gates in series, with inputs A, B, C and D, is the same as a single four-input AND gate with inputs A, B, C and D]
▲ Figure 3.33 4-input AND gate
Both sets of AND gates have the output A.B.C.D and they share identical truth tables.
Inputs      Output
A B C D     A.B.C.D
0 0 0 0     0
0 0 0 1     0
0 0 1 0     0
0 0 1 1     0
0 1 0 0     0
0 1 0 1     0
0 1 1 0     0
0 1 1 1     0
1 0 0 0     0
1 0 0 1     0
1 0 1 0     0
1 0 1 1     0
1 1 0 0     0
1 1 0 1     0
1 1 1 0     0
1 1 1 1     1
▲ Table 3.17
Multi-input OR gates
[Figure: two two-input OR gates in series, with inputs A, B and C, is the same as a single three-input OR gate with inputs A, B and C]
▲ Figure 3.34 Multi-input OR gate
Both sets of OR gates have the output A + B + C and they share identical truth tables.
Inputs    Output
A B C     A+B+C
0 0 0     0
0 0 1     1
0 1 0     1
0 1 1     1
1 0 0     1
1 0 1     1
1 1 0     1
1 1 1     1
▲ Table 3.18
Now consider the following:
[Figure: three two-input OR gates in series, with inputs A, B, C and D, is the same as a single four-input OR gate with inputs A, B, C and D]
▲ Figure 3.35 4-input OR gate
Both sets of OR gates have the output A + B + C + D and they share identical truth tables.
Inputs      Output
A B C D     A+B+C+D
0 0 0 0     0
0 0 0 1     1
0 0 1 0     1
0 0 1 1     1
0 1 0 0     1
0 1 0 1     1
0 1 1 0     1
0 1 1 1     1
1 0 0 0     1
1 0 0 1     1
1 0 1 0     1
1 0 1 1     1
1 1 0 0     1
1 1 0 1     1
1 1 1 0     1
1 1 1 1     1
▲ Table 3.19
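The claim that a multi-input AND or OR gate behaves exactly like a chain of two-input gates can be checked exhaustively. The following Python sketch is an illustration with made-up function names: `multi_and` and `multi_or` model a single multi-input gate, while `cascaded_and` and `cascaded_or` feed the output of each two-input gate into the next one.

```python
from functools import reduce
from itertools import product

def multi_and(*inputs):
    return int(all(inputs))   # 1 only if every input is 1

def multi_or(*inputs):
    return int(any(inputs))   # 1 if at least one input is 1

def cascaded_and(*inputs):
    # ((A.B).C).D ... built from two-input AND gates
    return reduce(lambda x, y: x & y, inputs)

def cascaded_or(*inputs):
    # ((A + B) + C) + D ... built from two-input OR gates
    return reduce(lambda x, y: x | y, inputs)

for n in (3, 4):
    for row in product((0, 1), repeat=n):
        assert multi_and(*row) == cascaded_and(*row)
        assert multi_or(*row) == cascaded_or(*row)
print("Multi-input and cascaded two-input gates agree for 3 and 4 inputs")
```

The same check would not work for NAND or NOR gates chained in this simple way, which is why Activity 3F asks for those circuits to be designed separately.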
ACTIVITY 3F
1 a) Draw the following multi-input NAND gate using two-input NAND gates only:
[Figure: four-input NAND gate with inputs A, B, C and D]
b) Construct the truth tables for the above 4-input NAND gate and for your circuit drawn in part a). Confirm that they are identical.
2 a) Draw the following multi-input NOR gates using two-input NOR gates only.
[Figure: three-input NOR gate with inputs A, B and C; four-input NOR gate with inputs A, B, C and D]
b) Construct the truth tables for the above 3-input NOR gate and for your equivalent circuit drawn in part a). Confirm they are identical.
c) Construct the truth tables for the above 4-input NOR gate and for your equivalent circuit drawn in part a). Confirm they are identical.
3 Confirm that the following two logic circuits are identical by constructing the truth tables for each circuit.
[Figure: two logic circuits, each with inputs A, B and C]
End of chapter questions
1 a) Many mobile phone and tablet manufacturers are moving to OLED screen technology. Give three reasons why this is happening. [3]
b) A television manufacturer makes the following advertising claim: ‘Our OLED screens allow the user to enjoy over one million vivid colours in true-to-life vision.’ Comment on the validity of this claim. [4]
2 a) A company is developing a new games console. The game will be stored on a ROM chip once the program to run the new game has been fully tested and developed.
i) Give two advantages of putting the game’s program on a ROM chip. [2]
ii) Explain why the manufacturers would use an EPROM chip during development. [2]
iii) The manufacturers are also using RAM chips on the internal circuit board. Explain why they are doing this. [2]
iv) The games console will have four USB ports. Apart from the need to attach games controllers, give reasons why USB ports are incorporated. [2]
b) During development of the games console the plastic parts are being made by a 3D printer. Give two reasons why the manufacturer would use 3D printers. [2]
3 An air conditioning unit in a car is being controlled by a microprocessor and a number of sensors.
a) Describe the main differences between control and monitoring of a process. [2]
b) Describe how the sensors and microprocessor would be used to control the air conditioning unit in the car. Name at least two different sensors that might be used and explain the role of positive feedback in your description. You might find drawing a diagram of your intended process to be helpful. [6]
4 The nine stages in printing a page using an inkjet printer are shown below. They are not in the correct order. Write the letters A to I so that the stages are in the correct order. [9]
A The data is then sent to the printer and it is stored in a temporary memory known as a printer buffer.
B As the sheet of paper is fed through the printer, the print head moves from side to side across the paper printing the text or image. The four ink colours are sprayed in their exact amounts to produce the desired final colour.
C The data from the document is sent to a printer driver.
D Once the printer buffer is empty, the printer sends an interrupt to the processor in the computer, which is a request for more data to be sent to the printer. The whole process continues until the whole of the document has been printed.
E The printer driver ensures that the data is in a format that the chosen printer can understand.
F At the end of each full pass of the print head, the paper is advanced very slightly to allow the next line to be printed. This continues until the whole page has been printed.
G A check is made by the printer driver to ensure that the chosen printer is available to print (is it busy? is it off line? is it out of ink? and so on).
H If there is more data in the printer buffer, then the whole process from stage 5 is repeated until the buffer is finally empty.
I A sheet of paper is then fed into the main body of the printer, where a sensor detects whether paper is available in the paper feed tray – if it is out of paper (or the paper is jammed) then an error message is sent back to the computer.
5 a) There are two types of RAM: dynamic RAM (DRAM) and static RAM (SRAM). Five statements about DRAM and SRAM are shown below. Copy the diagram below and connect each statement to the appropriate type of RAM. [5]
Statement
» requires the data to be refreshed periodically in order to retain data
» has more complex circuitry
» does not need to be refreshed as the circuit holds the data as long as the power supply is on
» requires higher power consumption which is significant when used in battery-powered devices
» used predominantly in cache memory of processors where speed is important
Type of RAM
» DRAM
» SRAM
b) Give three differences between RAM and ROM. [3]
c) DVD-RAM and flash memory are two examples of storage devices. Describe two differences in how they operate. [2]
Cambridge International AS & A Level Computer Science 9608 Paper 13 Q4 June 2015
6 a) Three digital sensors, A, B and C, are used to monitor a process. The outputs from the sensors are used as the inputs to a logic circuit. A signal, X, is output from the logic circuit:
[Figure: inputs A, B and C feed a box labelled ‘logic circuit’, which produces output X]
Output, X, has a value of 1 if either of the following two conditions occur:
– Sensor A outputs the value 1 OR sensor B outputs the value 0.
– Sensor B outputs the value 1 AND sensor C outputs the value 0.
Draw a logic circuit to represent these conditions. [5]
b) Copy and complete the truth table for the logic circuit described in part a). [4]
A B C     working space     X
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
c) Write a logic statement that describes the following logic circuit. [3]
[Figure: logic circuit with inputs A, B and C and output X]
Cambridge International AS & A Level Computer Science 9608 Paper 13 Q6 June 2015
4 Processor fundamentals
In this chapter, you will learn about
★ the basic Von Neumann model of a computer system
★ the purpose and role of the registers PC, MDR, MAR, ACC, IX, CIR, and the status registers
★ the purpose and role of the arithmetic logic unit (ALU), control unit (CU), system clock and immediate access store (IAS)
★ functions of the address bus, data bus and control bus
★ factors affecting computer performance (such as processor type, bus width, clock speeds, cache memory and use of core processors)
★ the connection of computers to peripheral devices such as Universal Serial Bus (USB), high definition multimedia interface (HDMI) and Video Graphics Array (VGA)
★ the fetch-execute cycle and register transfers
★ the purpose of interrupts
★ the relationship between assembly language and machine code (such as symbolic, absolute and relative addressing)
★ different stages for a two-pass assembler
★ tracing sample assembly language programming code
★ assembly language instruction groups (such as data movement, I/O operations, arithmetic operations, comparisons, and so on)
★ addressing modes (immediate, direct, indirect, indexed and relative)
★ how to perform binary shifts (including logical, arithmetic, cyclic, left shift and right shift)
★ how bit manipulation is used to monitor/control a device.
4.1 Central processing unit (CPU) architecture
WHAT YOU SHOULD ALREADY KNOW
Try these four questions before you read the first part of this chapter.
1 a) Name the main components that make up a typical computer system.
b) Tablets and smart phones carry out many of the functions of a desktop or laptop computer. Describe the main differences between the operations of a desktop or laptop computer and a tablet or phone.
2 When deciding on which computer, tablet or phone to buy, which are the main factors that determine your final choice?
3 Look at a number of computers, laptops and phones and list (and name) the types of input and output ports found on each device.
4 At the centre of all of the above electronic devices is the microprocessor. How has the development of the microprocessor changed over the last ten years?
Key terms
Von Neumann architecture – computer architecture which introduced the concept of the stored program in the 1940s.
Arithmetic logic unit (ALU) – component in the processor which carries out all arithmetic and logical operations.
Control unit – ensures synchronisation of data flow and programs throughout the computer by sending out control signals along the control bus.
System clock – produces timing signals on the control bus to ensure synchronisation takes place.
Immediate access store (IAS) – holds all data and programs needed to be accessed by the control unit.
Accumulator – temporary general purpose register which stores numerical values at any part of a given operation.
Register – temporary component in the processor which can be general or specific in its use and which holds data or instructions as part of the fetch-execute cycle.
Status register – used when an instruction requires some form of arithmetic or logical processing.
Flag – indicates the status of a bit in the status register; for example, N = 1 indicates the result of an addition gives a negative value.
Address bus – carries the addresses throughout the computer system.
Data bus – allows data to be carried from processor to memory (and vice versa) or to and from input/output devices.
Control bus – carries signals from control unit to all other computer components.
Unidirectional – used to describe a bus in which bits can travel in one direction only.
Bidirectional – used to describe a bus in which bits can travel in both directions.
Word – group of bits used by a computer to represent a single unit.
Clock cycle – clock speeds are measured in terms of GHz; this is the vibrational frequency of the clock which sends out pulses along the control bus; a 3.5 GHz clock speed means 3.5 billion clock cycles a second.
Overclocking – changing the clock speed of a system clock to a value higher than the factory/recommended setting.
BIOS – basic input/output system.
Cache memory – a high speed auxiliary memory which permits high speed data transfer and retrieval.
Core – a unit made up of ALU, control unit and registers which is part of a CPU. A CPU may contain a number of cores.
Dual core – a CPU containing two cores.
Quad core – a CPU containing four cores.
Port – external connection to a computer which allows it to communicate with various peripheral devices. A number of different port technologies exist.
Universal Serial Bus (USB) – a type of port connecting devices to a computer.
Asynchronous serial data transmission – serial refers to a single wire being used to transmit bits of data one after the other. Asynchronous refers to a sender using its own clock/timer device rather than sharing the same clock/timer with the recipient device.
High-definition multimedia interface (HDMI) – type of port connecting devices to a computer.
Video Graphics Array (VGA) – type of port connecting devices to a computer.
High-bandwidth digital copy protection (HDCP) – part of HDMI technology which reduces risk of piracy of software and multimedia.
Fetch-execute cycle – a cycle in which instructions and data are fetched from memory and then decoded and finally executed.
Program counter (PC) – a register used in a computer to store the address of the next instruction to be fetched and executed.
Current instruction register (CIR) – a register used to contain the instruction which is currently being executed or decoded.
Register Transfer Notation (RTN) – short hand notation to show movement of data and instructions in a processor; can be used to represent the operation of the fetch-execute cycle.
Interrupt – signal sent from a device or software to a processor requesting its attention; the processor suspends all operations until the interrupt has been serviced.
Interrupt priority – all interrupts are given a priority so that the processor knows which need to be serviced first and which interrupts are to be dealt with quickly.
Interrupt service routine (ISR) or interrupt handler – software which handles interrupt requests (such as ‘printer out of paper’) and sends the request to the CPU for processing.
4.1.1 Von Neumann model
Early computers were fed data while the machines were running. It was not possible to store programs or data; that meant they could not operate without considerable human intervention.
In the mid-1940s, John Von Neumann developed the concept of the stored program computer. It has been the basis of computer architecture for many years. The main, previously unavailable, features of the Von Neumann architecture were:
» a central processing unit (CPU or processor)
» a processor able to access the memory directly
» computer memories that could store programs as well as data
» stored programs made up of instructions that could be executed in sequential order.
Figure 4.1 shows a simple representation of Von Neumann architecture.
[Figure: the CPU contains the control unit (CU), the arithmetic and logic unit (ALU), the system clock, the program counter (PC), the memory address register (MAR), the memory data register (MDR), the current instruction register (CIR), the accumulator (ACC) and the status registers (SR); the address bus, data bus and control bus are also shown]
▲ Figure 4.1 Representation of Von Neumann architecture
4.1.2 Components of the processor (CPU) The main components of the processor are the arithmetic logic unit (ALU), the control unit (CU), the system clock and the immediate access store (IAS).
Arithmetic logic unit (ALU) The ALU allows the required arithmetic or logic operations to be carried out while a program is being run. It is possible for a computer to have more than one ALU – one will perform fixed point operations and the other floating-point operations (see Chapter 13). Multiplication and division are carried out by a sequence of addition, subtraction and left/right shifting operations (for example, shifting 0 0 1 1 0 1 1 1 two places to the left gives 1 1 0 1 1 1 0 0, which is equivalent to multiplying by a factor of 4).
4 Processor fundamentals
The accumulator (ACC) is a temporary register used when carrying out ALU calculations.
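The shifting example above can be checked with a short Python sketch. The function name `shift_left_8bit` is made up for illustration, and the sketch assumes an 8-bit register, so any bits pushed past the left-hand end are simply discarded.

```python
def shift_left_8bit(value, places):
    """Shift an 8-bit value left, discarding bits that fall off the left end."""
    return (value << places) & 0xFF


original = 0b00110111                    # the bit pattern used in the text
shifted = shift_left_8bit(original, 2)   # shift two places to the left

print(format(original, "08b"), "=", original)
print(format(shifted, "08b"), "=", shifted)
```

Shifting two places left multiplies by 4 only while no 1 bits are lost from the left-hand end of the register.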
Control unit (CU)
The CU reads an instruction from memory (the address of the location where the instruction can be found is stored in the program counter (PC)). This instruction is then interpreted. During that process, signals are generated along the control bus to tell the other components in the computer what to do. The CU ensures synchronisation of data flow and program instructions throughout the computer.
System clock
A system clock is used to produce timing signals on the control bus to ensure this vital synchronisation takes place – without the clock the computer would simply crash. (See Section 4.1.4 System buses.)
Immediate access store (IAS)
The IAS holds all the data and programs that the processor (CPU) needs to access. The CPU takes data and programs held in backing store and puts them into the IAS temporarily. This is done because read/write operations carried out using the IAS are considerably faster than read/write operations to backing store. Consequently, any key data needed by an application will be stored temporarily in IAS to speed up operations. The IAS is another name for primary (RAM) memory.
4.1.3 Registers One of the most fundamental components of the Von Neumann system is the register. Registers can be general purpose or special purpose. General purpose registers hold data that is frequently used by the CPU or can be used by the programmer when addressing the CPU directly. The accumulator is a good example of a general purpose register and will be used as such throughout this book. Special purpose registers have a specific function within the CPU and hold the program state. The most common special registers referred to in this book are shown in Table 4.1. The use of many of these registers is explained more fully in Section 4.1.6 (fetch-execute cycle) and in Section 4.2 (tracing of assembly code programs).
Register | Abbreviation | Function/purpose of register
current instruction register | CIR | stores the current instruction being decoded and executed
index register | IX | used when carrying out index addressing operations (assembly code)
memory address register | MAR | stores the address of the memory location currently being read from or written to
memory data/buffer register | MDR/MBR | stores data which has just been read from memory or data which is about to be written to memory (sometimes referred to as MBR)
program counter | PC | stores the address where the next instruction to be read can be found
status register | SR | contains bits which can be set or cleared depending on the operation (for example, to indicate overflow in a calculation)
▲ Table 4.1 Common registers
All of the registers listed in Table 4.1 (apart from status and index registers) are used in the fetch-execute cycle, which is covered later in this chapter. Index registers are best explained when looking at addressing techniques in assembly code (again, this is covered later in the chapter).
A status register is used when an instruction requires some form of arithmetic or logic processing. Each bit is known as a flag. Most systems have the following four flags.
» Carry flag (C) is set to 1 if there is a CARRY following an addition operation (refer to Chapter 1).
» Negative flag (N) is set to 1 if the result of a calculation yields a NEGATIVE value.
» Overflow flag (V) is set to 1 if an arithmetic operation results in an OVERFLOW being produced.
» Zero flag (Z) is set to 1 if the result of an arithmetic or logic operation is ZERO.
Consider this arithmetic operation:
  01110111
+ 00111000
= 10101111
Flags: NVCZ 1100
Since we have two positive numbers being added, the answer should not be negative. The flags indicate two errors: a negative result, and an overflow occurred.
Now consider this operation:
   10001000
+  11000111
= 101001111
Flags: NVCZ 0110
Since we have two negative numbers being added, the answer should be negative. The flags indicate that two errors have occurred: a carry has been generated, and a ninth bit overflow has occurred.
Other flags can be generated, such as a parity flag, an interrupt flag or a half-carry flag.
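The two worked additions can be reproduced with a small Python sketch. The names `add_with_flags` and `flag_string` are invented for this illustration, and the sketch assumes 8-bit two's complement values, with overflow detected when both operands share a sign bit that the result does not.

```python
def add_with_flags(a, b):
    """Add two 8-bit values; return the 8-bit result and the N, V, C, Z flags."""
    total = a + b
    result = total & 0xFF                     # keep only the lowest 8 bits
    sign_a, sign_b, sign_r = a >> 7, b >> 7, result >> 7
    flags = {
        "N": sign_r,                                       # sign bit of the result
        "V": int(sign_a == sign_b and sign_r != sign_a),   # operands agree in sign, result differs
        "C": int(total > 0xFF),                            # a ninth bit was generated
        "Z": int(result == 0),                             # result is zero
    }
    return result, flags


def flag_string(flags):
    """Write the flags in the order N, V, C, Z."""
    return "".join(str(flags[name]) for name in "NVCZ")


for a, b in [(0b01110111, 0b00111000), (0b10001000, 0b11000111)]:
    result, flags = add_with_flags(a, b)
    print(format(a, "08b"), "+", format(b, "08b"), "=", format(result, "08b"),
          "NVCZ", flag_string(flags))
```

Real processors set these flags in hardware as part of the ALU operation; the dictionary here only mirrors the four flags named in the text.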
EXTENSION ACTIVITY 4A
Find out what conditions could cause:
a) a parity flag (P) being set to 1
b) an interrupt flag (I) being set to 1
c) a zero flag (Z) being set to 1
d) a half-carry flag (H) being set to 1.
4.1.4 System buses
[Figure 4.2 labels: CPU; memory; input/output ports; control bus; address bus; data bus; system bus]
▲ Figure 4.2 System buses
(System) buses are used in computers as a parallel transmission component; each wire in the bus transmits one bit of data. There are three common buses used in the Von Neumann architecture known as address bus, data bus and control bus.
Address bus As the name suggests, the address bus carries addresses throughout the computer system. Between the CPU and memory the address bus is unidirectional (in other words, bits can travel in one direction only). This prevents addresses being carried back to the CPU, which would be undesirable. The width of a bus is important. The wider the bus, the more memory locations which can be directly addressed at any given time; for example, a bus of width 16 bits can address 2^16 (65 536) memory locations, whereas a bus width of 32 bits allows 2^32 (4 294 967 296) memory locations to be simultaneously addressed. Even this is not large enough for modern computers, but the technology behind even wider buses is outside the scope of this book.
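The relationship between address bus width and the number of addressable locations is a power of two, which a short Python sketch can confirm. The function name `addressable_locations` is an assumption made for this example.

```python
def addressable_locations(bus_width_bits):
    """Each extra address line doubles the number of locations that can be addressed."""
    return 2 ** bus_width_bits


for width in (16, 32):
    print(width, "bit address bus:", addressable_locations(width), "locations")
```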
Data bus The data bus is bidirectional (in other words, it allows data to be sent in both directions along the bus). This means data can be carried from CPU to memory (and vice versa) as well as to and from input/output devices. It is important to point out that data can be an address, an instruction or a numerical value. As with the address bus, the width of the data bus is important: the wider the bus, the larger the word length that can be transported. (A word is a group of bits which can be regarded as a single unit, for example, 16-bit, 32-bit or 64-bit word lengths are the most common). Larger word lengths can improve the computer’s overall performance.
Control bus The control bus is also bidirectional. It carries signals from the CU to all the other computer components. It is usually 8 bits wide since it only carries control signals.
It is worth mentioning here the role of the system clock. The clock defines the clock cycle which synchronises all computer operations. As mentioned earlier, the control bus transmits timing signals, ensuring everything is fully synchronised. By increasing clock speed, the processing speed of the computer is also increased (a typical current value is 3.5 GHz – which means 3.5 billion clock cycles a second). Although the speed of the computer may have been increased, it is not possible to say that a computer’s overall performance is necessarily increased by using a higher clock speed. Four other factors need to be considered.
1 Width of the address bus and data bus can affect computer performance.
2 Overclocking: the clock speed can be changed by accessing the basic input/output system (BIOS) and altering the settings. However, using a clock speed higher than the computer was designed for can lead to problems, such as
– execution of instructions outside design limits, which can lead to seriously unsynchronised operations (in other words, an instruction is unable to complete in time before the next one is due to be executed) and the computer would frequently crash and become unstable
– serious overheating of the CPU leading to unreliable performance.
3 The use of cache memory can also improve processor performance. It is similar to RAM in that its contents are lost when the power is turned off. Cache uses SRAM (see Chapter 3) whereas most computers use DRAM for main memory. Therefore, cache memories will have faster access times, since there is no need to keep refreshing, which slows down access time. When a processor reads memory, it first checks out cache and then moves on to main memory if the required data is not there. Cache memory stores frequently used instructions and data that need to be accessed faster. This improves processor performance.
4 The use of a different number of cores (one core is made up of an ALU, a CU and the registers) can improve computer performance. Many computers are dual core (the CPU is made up of two cores) or quad core (the CPU is made up of four cores). The idea of using more cores alleviates the need to continually increase clock speeds. However, doubling the number of cores does not necessarily double the computer’s performance since we have to take into account the need for the CPU to communicate with each core; this will reduce overall performance. For example
– dual core has one channel and needs the CPU to communicate with both cores, reducing some of the potential increase in its performance
– quad core has six channels and needs the CPU to communicate with all four cores, considerably reducing potential performance.
[Figure 4.3 labels: core 1 and core 2 (left); core 1, core 2, core 3 and core 4 (right)]
▲ Figure 4.3 Two cores, one channel (left) and four cores, six channels (right)
All of these factors need to be taken into account when considering computer performance.
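The channel counts quoted for dual core and quad core processors are consistent with every core having a direct channel to every other core. The Python sketch below rests on that assumption, which the text does not state as a general rule, and the function name `channels_between_cores` is invented for the example.

```python
def channels_between_cores(cores):
    """Number of channels if each pair of cores shares exactly one channel (an assumption)."""
    return cores * (cores - 1) // 2


for cores in (2, 4):
    print(cores, "cores:", channels_between_cores(cores), "channel(s)")
```

Under this assumption the number of channels grows faster than the number of cores, which fits the point that adding cores does not scale performance in proportion.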
In summary
» increasing bus width (data and address buses) increases the performance and speed of a computer system
» increasing clock speed usually increases the speed of a computer
» a computer’s performance can be changed by altering bus width, clock speed and use of multi-core CPUs
» use of cache memories can also speed up a processor’s performance.
4.1.5 Computer ports Input and output devices are connected to a computer via ports. The interaction of the ports with connected input and output is controlled by the control unit. Here we will summarise some of the more common types of ports found on modern computers.
▲ Figure 4.4 (from left to right) USB cable, HDMI cable, VGA cable
USB ports
The Universal Serial Bus (USB) is an asynchronous serial data transmission method. It has quickly become the standard method for transferring data between a computer and a number of devices. The USB cable consists of a four-wire shielded cable, with two wires for power and the earth, and two wires used for data transmission. When a device is plugged into a computer using one of the USB ports
» the computer automatically detects that a device is present (this is due to a small change in the voltage level on the data signal wires in the cable)
» the device is automatically recognised, and the appropriate device driver is loaded up so that computer and device can communicate effectively
» if a new device is detected, the computer will look for the device driver which matches the device. If this is not available, the user is prompted to download the appropriate software.
The USB system has become the industry standard, but there are still pros and cons to using this system, as summarised in Table 4.2.
Pros of USB system
■ devices plugged into the computer are automatically detected and device drivers are automatically loaded up
■ the connectors can only fit one way, which prevents incorrect connections being made
■ this has become the industry standard, which means that considerable support is available to users
■ several different data transmission rates are supported
■ newer USB standards are backward compatible with older USB standards
Cons of USB system
■ the present transmission rate is limited to less than 500 megabits per second
■ the maximum cable length is presently about five metres
■ the older USB standard (such as 1.1) may not be supported in the near future
▲ Table 4.2 Pros and cons of the USB system
High-definition multimedia interface (HDMI)
High-definition multimedia interface (HDMI) ports allow output (both audio and visual) from a computer to an HDMI-enabled device. They support high-definition signals (enhanced or standard). HDMI was introduced as a digital replacement for the older Video Graphics Array (VGA) analogue system. Modern HD (high definition) televisions have the following features, which are making VGA a redundant technology:
» They use a widescreen format (16:9 aspect ratio).
» The screens use a greater number of pixels (typically 1920 × 1080).
» The screens have a faster refresh rate (such as 120 Hz or 120 frames a second).
» The range of colours is extremely large (some companies claim up to four million different colour variations).
This means that modern HD televisions require more data, which has to be received at a much faster rate than with older televisions (around 10 gigabits per second). HDMI increases the bandwidth, making it possible to supply the necessary data for high quality sound and visual effects. HDMI can also afford some protection against piracy since it uses high-bandwidth digital content protection (HDCP). HDCP uses a type of authentication protocol (see Chapters 6 and 17). For example, a Blu-ray player will check the authentication key of the device it is sending data to (such as an HD television). If the key can be authenticated, then handshaking takes place and the Blu-ray player can start to transmit data to the connected device.
Video Graphics Array (VGA) VGA was introduced at the end of the 1980s. VGA supports 640 × 480 pixel resolution on a television or monitor screen. It can also handle a refresh rate of up to 60 Hz (60 frames a second) provided there are only 16 different colours being used. If the pixel density is reduced to 200 × 320, then it can support up to 256 colours.
The technology is analogue and, as mentioned in the previous section, is being phased out. Table 4.3 summarises the pros and cons of HDMI and VGA.
Pros of HDMI
■ the current standard for modern televisions and monitors
■ allows for a very fast data transfer rate
■ improved security (helps prevent piracy)
■ supports modern digital systems
Cons of HDMI
■ not a very robust connection (easy to break connection when simply moving device)
■ limited cable length to retain good signal
■ there are currently five cable/connection standards
Pros of VGA
■ simpler technology
■ only one standard available
■ it is easy to split the signal and connect a number of devices from one source
■ the connection is very secure
Cons of VGA
■ old out-dated analogue technology
■ it is easy to bend the pins when making connections
■ the cables must be of a very high grade to ensure good undistorted signal
▲ Table 4.3 Pros and cons of HDMI and VGA
4.1.6 Fetch-execute cycle We have already considered the role of buses and registers in the processor. This next section shows how an instruction is decoded and executed in the fetch-execute cycle using various components in the processor. To execute a set of instructions, the processor first fetches data and instructions from memory and stores them in suitable registers. Both the address bus and data bus are used in this process. Once this is done, each instruction needs to be decoded before being executed.
Fetch
The next instruction is fetched from the memory address currently stored in the program counter (PC) and is then stored in the current instruction register (CIR). The PC is then incremented (increased by 1) so that the next instruction can be processed. The fetched instruction is then decoded so that it can be interpreted in the next part of the cycle.
Execute
The processor passes the decoded instruction as a set of control signals to the appropriate components within the computer system. This allows each instruction to be carried out in its logical sequence.
Figure 4.5 shows how the fetch-execute cycle is carried out in the Von Neumann computer model.
[Figure 4.5 flowchart, boxes in order]
START
any instructions? (yes/no)
1 the program counter (PC) contains the address of the memory location of the next instruction which has to be fetched
2 this address is then copied from the PC to the memory address register (MAR) using the address bus
3 the contents (instruction) at the memory location (address) contained in MAR are then copied temporarily into the memory data register (MDR)
4 the contents (instruction) of the MDR are then copied and placed into the current instruction register (CIR)
5 the value in the PC is then incremented by one so that it now points to the next instruction which has to be fetched
6 the instruction is finally decoded and then executed by sending out signals (via the control bus) to the various components of the computer system
any interrupts to service? (yes/no); if yes, service the interrupt
▲ Figure 4.5 How the fetch-execute cycle is carried out in the Von Neumann computer model
When registers are involved, it is possible to describe what is happening by using Register Transfer Notation (RTN). In its simplest form:
MAR ← [PC]          contents of PC copied into MAR
PC ← [PC] + 1       PC is incremented by 1
MDR ← [[MAR]]       data stored at address shown in MAR is copied into MDR
CIR ← [MDR]         contents of MDR copied into CIR
Double brackets are used in the third line because it is not MAR contents being copied into MDR but it is the data stored at the address shown in MAR that is being copied to MDR. Compare the above instructions to those shown in Figure 4.5. Inspection should show the register transfer notation is carrying out the same function.
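The four register transfers can be traced with a minimal Python sketch. The memory contents, the starting address 100 and the function name `fetch` are all made up for illustration; memory is modelled as a dictionary and each statement mirrors one line of the RTN.

```python
# Hypothetical program held in memory: address -> instruction
memory = {100: "LDD 200", 101: "ADD 201", 102: "END"}

# Special purpose registers used during the fetch stage
registers = {"PC": 100, "MAR": None, "MDR": None, "CIR": None}


def fetch(registers, memory):
    """Carry out the fetch stage, one statement per line of RTN."""
    registers["MAR"] = registers["PC"]             # MAR <- [PC]
    registers["PC"] = registers["PC"] + 1          # PC <- [PC] + 1
    registers["MDR"] = memory[registers["MAR"]]    # MDR <- [[MAR]]
    registers["CIR"] = registers["MDR"]            # CIR <- [MDR]
    return registers["CIR"]


first = fetch(registers, memory)
print("Fetched:", first)
print("Registers:", registers)
```

The double lookup `memory[registers["MAR"]]` is the code equivalent of the double brackets: MAR holds an address, and it is the contents of that address that are copied.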
RTN can be abstract (generic notation – as shown above) or concrete (specific to a particular machine – example shown below). For example, on a RISC computer:
instruction_interpretation := (¬Run/Start → Run ← 1; instruction_interpretation):
Run → (CIR ← M[PC]: PC ← PC + 4; instruction_execution)
Use of interrupts in the fetch-execute cycle
Section 4.1.7 gives a general overview of how a computer uses interrupts to operate efficiently and, for example, to carry out multi-tasking functions. Before we discuss interrupts in this general fashion, the following notes explain how interrupts are specifically used in the fetch-execute cycle.
A special register called the interrupt register is used in the fetch-execute cycle. While the CPU is in the middle of carrying out this cycle, an interrupt could occur, which will cause one of the bits in the interrupt register to change its status. For example, the initial status might be 0000 0000 and a fault might occur while writing data to the hard drive; this would cause the register to change to 0000 1000. The following sequence now takes place.
» At the next fetch-execute cycle, the interrupt register is checked bit by bit.
» The contents 0000 1000 would indicate an interrupt occurred during a previous cycle and it still needs servicing. The CPU would now service this interrupt or ignore it for now, depending on its priority.
» If the interrupt is to be serviced, the CPU stops its current task and stores the contents of its registers (see Section 4.1.7 for more details about how this is done).
» Control is now transferred to the interrupt handler (or interrupt service routine, ISR).
» Once the interrupt is fully serviced, the register is reset and the contents of registers are restored.
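Checking the interrupt register bit by bit can be sketched in Python as follows. The function names are invented for this example, bit positions are counted from 0 at the right-hand end, and which device corresponds to which bit is not specified in the text, so none is assumed here.

```python
def pending_interrupts(interrupt_register):
    """Return the positions (0 = rightmost bit) of all bits set in an 8-bit register."""
    return [bit for bit in range(8) if interrupt_register & (1 << bit)]


def clear_interrupt(interrupt_register, bit):
    """Reset one bit once its interrupt has been fully serviced."""
    return interrupt_register & ~(1 << bit) & 0xFF


register = 0b00001000                        # status after the fault described above
print("Pending bits:", pending_interrupts(register))

register = clear_interrupt(register, 3)      # interrupt serviced, bit reset
print("Register after reset:", format(register, "08b"))
```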
Figure 4.6 summarises the interrupt process during the fetch-execute cycle.
[Figure 4.6 flowchart with three columns: FETCH stage in the cycle, EXECUTE stage in the cycle, INTERRUPT stage in the cycle. Boxes: START; fetch the next instructions; decode and execute the instruction; any interrupts? (Yes/No); end of the program? (Yes/No); END; suspend execution of current program/task; interrupts are disabled; execute interrupt service routine (ISR); interrupts enabled and interrupt priority checked]
▲ Figure 4.6 The interrupt process during the fetch-execute cycle
4.1.7 Interrupts
An interrupt is a signal sent from a device or from software to the processor. This will cause the processor to temporarily stop what it is doing and service the interrupt. Interrupts can be caused by, for example
» a timing signal
» input/output processes (a disk drive is ready to receive more data, for example)
» a hardware fault (an error has occurred such as a paper jam in a printer, for example)
» user interaction (the user pressed a key to interrupt the current process, such as , for example)
» a software error that cannot be ignored (if an .exe file could not be found to initiate the execution of a program OR an attempt to divide by zero, for example).
Once the interrupt signal is received, the processor either carries on with what it was doing or stops to service the device/program that generated the interrupt. The computer needs to identify the interrupt type and also establish the level of interrupt priority. Interrupts allow computers to carry out many tasks or to have several windows open at the same time. An example would be downloading a file from the internet at the same time as listening to some music from the computer library.
Whenever an interrupt is serviced, the status of the current task being run is saved. The contents of the program counter and other registers are saved. Then, the interrupt service routine (ISR) is executed by loading the start address into the program counter. Once the interrupt has been fully serviced, the status of the interrupted task is reinstated (contents of saved registers retrieved) and it continues from the point prior to the interrupt being sent.
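The save, service and reinstate sequence used when an interrupt is serviced can be sketched in Python. Everything named here is hypothetical: the register values, the ISR start address 900 and the `keyboard_isr` routine exist only to show the order of events.

```python
def service_interrupt(registers, isr_start_address, isr):
    """Save the task's registers, run the ISR, then reinstate the saved state."""
    saved = dict(registers)                  # save the PC and the other registers
    registers["PC"] = isr_start_address      # load the ISR start address into the PC
    isr(registers)                           # execute the interrupt service routine
    registers.clear()
    registers.update(saved)                  # reinstate the interrupted task


log = []


def keyboard_isr(regs):
    """Hypothetical ISR that records where it ran and overwrites ACC."""
    log.append(regs["PC"])
    regs["ACC"] = 0


registers = {"PC": 205, "ACC": 17, "IX": 3}  # state of the interrupted task
service_interrupt(registers, 900, keyboard_isr)
print("ISR ran at address:", log[0])
print("Registers after servicing:", registers)
```

Even though the ISR changes ACC, the interrupted task sees its original register contents once servicing is complete.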
ACTIVITY 4A
1 a) Describe the functions of the following registers.
i) Current instruction register
ii) Memory address register
iii) Program counter
b) Status registers contain flags. Three such flags are named N, C and V.
i) What does each of the three flags represent?
ii) Give an example of the use of each of the three flags.
2 a) Name three buses used in the Von Neumann architecture.
b) Describe the function of each named bus.
c) Describe how bus width and clock speed can affect computer performance.
3 Copy the diagram below and connect each feature to the correct port, HDMI or VGA.
Type of port: HDMI, VGA
Feature:
» analogue interface
» can handle maximum refresh rate of 60 Hz
» digital interface
» can give additional protection from piracy
» easier to split the signal
» can support refresh rate up to 120 Hz
4 a) What is meant by the fetch-execute cycle?
b) Using register transfer notation, show the main stages in a typical fetch-execute cycle.
5 Copy and complete this paragraph by using terms from this chapter.
The processor ________ data and instructions required for an application and temporarily stores them in the ________ until they can be processed. The ________ is used to hold the address of the next instruction to be executed. This address is copied to the ________ using the ________. The contents at this address are stored in the ________. Each instruction is then ________ and finally ________ sending out ________ using the ________. Any calculations carried out are done using the ________. During any calculations, data is temporarily held in a special register known as the ________.
4.2 Assembly language
WHAT YOU SHOULD ALREADY KNOW
Try these three questions before you start the second part of this chapter.
1 a) Name two types of low-level programming language.
b) Name the only type of programming language that a CPU recognises.
c) Why do programmers find writing in this type of programming language difficult?
2 Find at least two different types of CPU and the language they use.
3 Look at your computer and/or laptop and/or phone and list the programming language(s) they use.
Key terms
Machine code – the programming language that the CPU uses.
Instruction – a single operation performed by a CPU.
Assembly language – a low-level chip/machine specific programming language that uses mnemonics.
Opcode – short for operation code, the part of a machine code instruction that identifies the action the CPU will perform.
Operand – the part of a machine code instruction that identifies the data to be used by the CPU.
Source code – a computer program before translation into machine code.
Assembler – a computer program that translates programming code written in assembly language into machine code. Assemblers can be one pass or two pass.
Instruction set – the complete set of machine code instructions used by a CPU.
Object code – a computer program after translation into machine code.
Addressing modes – different methods of using the operand part of a machine code instruction as a memory address.
Absolute addressing – mode of addressing in which the contents of the memory location in the operand are used.
Direct addressing – mode of addressing in which the contents of the memory location in the operand are used, which is the same as absolute addressing.
Indirect addressing – mode of addressing in which the contents of the contents of the memory location in the operand are used.
Indexed addressing – mode of addressing in which the contents of the memory location found by adding the contents of the index register (IX) to the address of the memory location in the operand are used.
Immediate addressing – mode of addressing in which the value of the operand only is used.
Relative addressing – mode of addressing in which the memory address used is the current memory address added to the operand.
Symbolic addressing – mode of addressing used in assembly language programming, where a label is used instead of a value.
4.2.1 Assembly language and machine code The only programming language that a CPU can use is machine code. Every different type of computer/chip has its own set of machine code instructions. A computer program stored in main memory is a series of machine code instructions that the CPU can automatically carry out during the fetch-execute cycle. Each machine code instruction performs one simple task, for example, storing a value in a memory location at a specified address. Machine code is binary; it is sometimes displayed on a screen as hexadecimal so that human programmers can understand machine code instructions more easily.
Writing programs in machine code is a specialised task that is very time consuming and often error prone, as the only way to test a program written in machine code is to run it and see what happens. In order to shorten the development time for writing computer programs, other programming languages were developed, where the instructions were easier to learn and understand. Any program not written in machine code needs to be translated before the CPU can carry out the instructions, so language translators were developed. The first programming language to be developed was assembly language; this is closely related to machine code and uses mnemonics instead of binary.
Assembly language mnemonics | Machine code hexadecimal | Machine code binary
LDD Total | 0140 | 0000000101000000
ADD 20 | 0214 | 0000001000010100
STO Total | 0340 | 0000001101000000
The structure of assembly language and machine code instructions is the same. Each instruction has an opcode that identifies the operation to be carried out by the CPU. Most instructions also have an operand that identifies the data to be used by the opcode.
Assembly language mnemonics: LDD Total (opcode LDD, operand Total)
Machine code hexadecimal: 0140 (opcode 01, operand 40)
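Splitting a machine code instruction into its opcode and operand can be sketched in Python. This is only an illustration: it assumes a 16-bit instruction whose upper 8 bits are the opcode and lower 8 bits the operand, and the `OPCODES` table is inferred from the three example instructions (0140, 0214, 0340) rather than taken from any real instruction set.

```python
# Opcode values inferred from the examples LDD 0140, ADD 0214 and STO 0340 (an assumption)
OPCODES = {0x01: "LDD", 0x02: "ADD", 0x03: "STO"}


def decode(instruction_hex):
    """Split a four-digit hexadecimal instruction into (mnemonic, operand)."""
    word = int(instruction_hex, 16)
    opcode = word >> 8          # upper 8 bits
    operand = word & 0xFF       # lower 8 bits
    return OPCODES[opcode], operand


for instruction in ("0140", "0214", "0340"):
    mnemonic, operand = decode(instruction)
    print(instruction, "->", mnemonic, "operand", operand, "(hex", format(operand, "02X") + ")")
```

Decoding 0214 gives the operand 20 in denary, which matches the mnemonic ADD 20.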
4.2.2 Stages of assembly
Before a program written in assembly language (source code) can be executed, it needs to be translated into machine code. The translation is performed by a program called an assembler. An assembler translates each assembly language instruction into a machine code instruction. An assembler also checks the syntax of the assembly language program to ensure that only opcodes from the appropriate machine code instruction set are used. This speeds up the development time, as some errors are identified during translation before the program is executed.
There are two types of assembler: single pass assemblers and two pass assemblers. A single pass assembler puts the machine code instructions straight into the computer memory to be executed. A two pass assembler produces an object program in machine code that can be stored, loaded then executed at a later stage. This requires the use of another program called a loader. Two pass assemblers need to scan the source program twice, so they can replace labels in the assembly program with memory addresses in the machine code program.
[Diagram: in the assembly language mnemonics LDD Total, the operand Total is a label; in the machine code hexadecimal 0140, the operand is a memory address]
Pass 1
» Read the assembly language program one line at a time.
» Ignore anything not required, such as comments.
» Allocate a memory address for the line of code.
» Check the opcode is in the instruction set.
» Add any new labels to the symbol table with the address, if known.
» Place address of labelled instruction in the symbol table.
Pass 2
» Read the assembly language program one line at a time.
» Generate object code, including opcode and operand, from the symbol table generated in Pass 1.
» Save or execute the program.
The second pass is required as some labels may be referred to before their address is known. For example, Found is a forward reference for the JPN instruction.
Label | Opcode | Operand
Notfound: | LDD | 200
 | CMP | #0
 | JPN | Found
 | JPE | Notfound
Found: | OUT |
If the program is to be loaded at memory address 100, and each memory location contains 16 bits, the symbol table for this small section of program would look like this:
Label | Address
Notfound | 100
Found | 104
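The two passes can be sketched in Python using the five-line program above. This is a simplified model, not a real assembler: it assumes each instruction occupies one memory location, and the second pass only replaces labels with addresses instead of producing binary object code. The names `SOURCE`, `INSTRUCTION_SET`, `pass_one` and `pass_two` are invented for the example.

```python
# The example program as (label, opcode, operand) tuples
SOURCE = [
    ("Notfound", "LDD", "200"),
    (None, "CMP", "#0"),
    (None, "JPN", "Found"),
    (None, "JPE", "Notfound"),
    ("Found", "OUT", None),
]

INSTRUCTION_SET = {"LDD", "CMP", "JPN", "JPE", "OUT"}


def pass_one(source, load_address):
    """Check each opcode and record the address of every labelled instruction."""
    symbol_table = {}
    for offset, (label, opcode, _operand) in enumerate(source):
        if opcode not in INSTRUCTION_SET:
            raise ValueError("unknown opcode: " + opcode)
        if label is not None:
            symbol_table[label] = load_address + offset
    return symbol_table


def pass_two(source, symbol_table, load_address):
    """Replace label operands with the addresses found in pass one."""
    program = []
    for offset, (_label, opcode, operand) in enumerate(source):
        resolved = symbol_table.get(operand, operand)
        program.append((load_address + offset, opcode, resolved))
    return program


symbols = pass_one(SOURCE, 100)
program = pass_two(SOURCE, symbols, 100)
print(symbols)
for line in program:
    print(line)
```

The JPN instruction at address 102 refers to Found before that label has been seen, which is why its operand can only be filled in during the second pass.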
4.2.3 Assembly language instructions There are different types of assembly language instructions. Examples of each type are given below.
Data movement instructions These instructions allow data stored at one location to be copied into the accumulator. This data can then be stored at another location, used in a calculation, used for a comparison or output.
Opcode | Operand | Explanation
LDM | #n | Load the number into ACC (immediate addressing is used)
LDD | address | Load the contents of the specified address into ACC (direct or absolute addressing is used)
LDI | address | The address to be used is the contents of the specified address. Load the contents of the contents of the given address into ACC (indirect addressing is used)
LDX | address | The address to be used is the specified address plus the contents of the index register. Load the contents of this calculated address into ACC (indexed addressing is used)
LDR | #n | Load the number n into IX (immediate addressing is used)
LDR | ACC | Load the number in the accumulator into IX
MOV | register | Move the contents of the accumulator to the register (IX)
STO | address | Store the contents of ACC into the specified address (direct or absolute addressing is used)
END | | Return control to the operating system
ACC is the single accumulator
IX is the Index Register
All numbers are denary unless identified as binary or hexadecimal
B is a binary number, for example B01000011
& is a hexadecimal number, for example &7B
# is a denary number
▲ Table 4.4 Data movement instructions
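The difference between the four load instructions is easiest to see side by side. The Python sketch below models memory as a dictionary; the memory contents, the index register value and the function name `load` are all made up for illustration.

```python
def load(opcode, operand, memory, ix):
    """Return the value that would be placed in ACC for each addressing mode."""
    if opcode == "LDM":                      # immediate: the operand is the value
        return operand
    if opcode == "LDD":                      # direct: contents of the address
        return memory[operand]
    if opcode == "LDI":                      # indirect: contents of the contents
        return memory[memory[operand]]
    if opcode == "LDX":                      # indexed: contents of (address + IX)
        return memory[operand + ix]
    raise ValueError("not a load instruction: " + opcode)


memory = {200: 201, 201: 42, 202: 99}        # hypothetical memory contents
ix = 2                                       # hypothetical index register value

for opcode in ("LDM", "LDD", "LDI", "LDX"):
    print(opcode, 200, "-> ACC =", load(opcode, 200, memory, ix))
```

With the same operand of 200, each instruction loads a different value because each one interprets the operand differently.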
Input and output of data instructions
These instructions allow data to be read from the keyboard or output to the screen.

Opcode   Operand   Explanation
IN                 Key in a character and store its ASCII value in ACC
OUT                Output to the screen the character whose ASCII value is stored in ACC

No operand is required as a single character is either input to the accumulator or output from the accumulator
▲ Table 4.5 Input and output of data instructions
Arithmetic operation instructions
These instructions perform simple calculations on data stored in the accumulator and store the answer in the accumulator, overwriting the original data.

Opcode   Operand      Explanation
ADD      <address>    Add the contents of the specified address to the ACC (direct or absolute addressing is used)
ADD      #n           Add the denary number n to the ACC
SUB      <address>    Subtract the contents of the specified address from the ACC
SUB      #n           Subtract the number n from the ACC
INC      <register>   Add 1 to the contents of the register (ACC or IX)
DEC      <register>   Subtract 1 from the contents of the register (ACC or IX)

Answers to calculations are always stored in the accumulator
▲ Table 4.6 Arithmetic operation instructions
Unconditional and conditional instructions

Opcode   Operand     Explanation
JMP      <address>   Jump to the specified address
JPE      <address>   Following a compare instruction, jump to the specified address if the comparison is True
JPN      <address>   Following a compare instruction, jump to the specified address if the comparison is False
END                  Returns control to the operating system

Jump means change the PC to the address specified, so the next instruction to be executed is the one stored at the specified address, not the one stored at the next location in memory
▲ Table 4.7 Unconditional and conditional instructions
Compare instructions

Opcode   Operand     Explanation
CMP      <address>   Compare the contents of ACC with the contents of the specified address (direct or absolute addressing is used)
CMP      #n          Compare the contents of ACC with the number n
CMI      <address>   The address to be used is the contents of the specified address; compare the contents of the contents of the given address with ACC (indirect addressing is used)

The contents of the accumulator are always compared
▲ Table 4.8 Compare instructions
4.2.4 Addressing modes
Assembly language and machine code programs use different addressing modes depending on the requirements of the program.

Absolute addressing – the contents of the memory location in the operand are used. For example, if the memory location with address 200 contained the value 20, the assembly language instruction LDD 200 would store 20 in the accumulator.

Direct addressing – the contents of the memory location in the operand are used. For example, if the memory location with address 200 contained the value 20, the assembly language instruction LDD 200 would store 20 in the accumulator. Absolute and direct addressing are the same.

Indirect addressing – the contents of the contents of the memory location in the operand are used. For example, if the memory location with address 200 contained the value 20 and the memory location with address 20 contained the value 5, the assembly language instruction LDI 200 would store 5 in the accumulator.

Indexed addressing – the contents of the memory location found by adding the contents of the index register (IX) to the address of the memory location in the operand are used. For example, if IX contained the value 4 and the memory location with address 204 contained the value 17, the assembly language instruction LDX 200 would store 17 in the accumulator.

Immediate addressing – the value of the operand only is used. For example, the assembly language instruction LDM #200 would store 200 in the accumulator.

Relative addressing – the memory address used is the current memory address added to the operand. For example, JMR #5 would transfer control to the instruction 5 locations after the current instruction.

Symbolic addressing – only used in assembly language programming. A label is used instead of a value. For example, if the memory location labelled MyStore contained the value 20, the assembly language instruction LDD MyStore would store 20 in the accumulator.

Labels make it easier to alter assembly language programs. When absolute addresses are used, every reference to an address needs to be edited if, for example, an extra instruction is added.

Label      Opcode     Operand     Explanation
<label>:   <opcode>   <operand>   Labels an instruction
<label>:              n           Gives a symbolic address <label> to the memory location with the contents n

▲ Table 4.9 Labels
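The difference between the addressing modes can be checked with a few lines of Python. This is a minimal sketch: the dictionary standing in for memory and the function names are assumptions made for illustration, with values taken from the worked examples above.

```python
# Hypothetical memory contents, chosen to match the worked examples in the text.
memory = {200: 20, 20: 5, 204: 17}
ix = 4  # contents of the index register


def load_immediate(n):
    """LDM #n: the operand itself is the value."""
    return n


def load_direct(address):
    """LDD address: use the contents of the address."""
    return memory[address]


def load_indirect(address):
    """LDI address: the contents of the address give the address to read."""
    return memory[memory[address]]


def load_indexed(address):
    """LDX address: read from the address plus the contents of IX."""
    return memory[address + ix]


print(load_immediate(200), load_direct(200), load_indirect(200), load_indexed(200))
```

Each function returns the value that would end up in the accumulator, so the four calls correspond to LDM #200, LDD 200, LDI 200 and LDX 200.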
4.2.5 Simple assembly language programs
A program written in assembly language will need many more instructions than a program written in a high-level language to perform the same task. In a high-level language, adding three numbers together and storing the answer would typically be written as a single instruction:

total = first + second + third

The same task written in assembly language could look like this:

Label     Opcode   Operand
start:    LDD      first
          ADD      second
          ADD      third
          STO      total
          END
first:             #20
second:            #30
third:             #40
total:             #0
If the program is to be loaded at memory address 100 after translation and each memory location contains 16 bits, the symbol table for this small section of program would look like this:
Label     Address
start     100
first     106
second    107
third     108
total     109

When this section of code is executed, the contents of ACC, CIR and the variables used can be traced using a trace table.

CIR   Opcode   Operand   ACC   first (106)   second (107)   third (108)   total (109)
100   LDD      first     20    20            30             40
101   ADD      second    50    20            30             40
102   ADD      third     90    20            30             40
103   STO      total     90    20            30             40            90
104   END
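A trace like this can also be produced mechanically. The following Python sketch interprets only the four opcodes this program uses; the dictionary layout and the function name run are illustrative assumptions, and the addresses are those given in the symbol table.

```python
SYMBOLS = {"start": 100, "first": 106, "second": 107, "third": 108, "total": 109}

# One instruction per memory location, starting at the load address.
PROGRAM = {
    100: ("LDD", "first"),
    101: ("ADD", "second"),
    102: ("ADD", "third"),
    103: ("STO", "total"),
    104: ("END", None),
}


def run(program, symbols, data):
    """Execute the program and return (final memory, trace rows)."""
    memory = dict(data)
    acc = 0
    pc = symbols["start"]
    trace = []
    while True:
        address = pc                 # address of the instruction being executed
        opcode, operand = program[address]
        pc += 1
        if opcode == "END":
            trace.append((address, opcode, "", acc))
            break
        if opcode == "LDD":
            acc = memory[symbols[operand]]
        elif opcode == "ADD":
            acc += memory[symbols[operand]]
        elif opcode == "STO":
            memory[symbols[operand]] = acc
        trace.append((address, opcode, operand, acc))
    return memory, trace


final_memory, rows = run(PROGRAM, SYMBOLS, {106: 20, 107: 30, 108: 40, 109: 0})
for row in rows:
    print(*row)
print("total =", final_memory[109])
```

Each printed row gives the instruction address, the opcode, the operand and the contents of ACC after the instruction has run.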
In a high-level language, adding a list of numbers together and storing the answer would typically be written using a loop.

FOR counter = 1 TO 3
    total = total + number[counter]
NEXT counter
The same task written in assembly language would require the use of the index register (IX). The assembly language program could look like this:

Label      Opcode   Operand   Comment
           LDM      #0        Load 0 into ACC
           STO      total     Store 0 in total
           STO      counter   Store 0 in counter
           LDR      #0        Set IX to 0
loop:      LDX      number    Load the number indexed by IX into ACC
           ADD      total     Add total to ACC
           STO      total     Store result in total
           INC      IX        Add 1 to the contents of IX
           LDD      counter   Load counter into ACC
           INC      ACC       Add 1 to ACC
           STO      counter   Store result in counter
           CMP      #3        Compare with 3
           JPN      loop      If ACC not equal to 3 then return to start of loop
           END
number:             #5        List of three numbers
                    #7
                    #3
counter:                      Counter for loop
total:                        Storage space for total

If the program is to be loaded at memory address 100 after translation and each memory location contains 16 bits, the symbol table for this small section of program would look like this:

Label      Address
loop       104
number     115
counter    118
total      119

When this section of code is executed the contents of ACC, CIR, IX and the variables used can be traced using a trace table:

CIR   Opcode   Operand   ACC   IX   Counter (118)   Total (119)
100   LDM      #0        0
101   STO      total     0                          0
102   STO      counter   0          0               0
103   LDR      #0        0     0    0               0
104   LDX      number    5     0    0               0
105   ADD      total     5     0    0               0
106   STO      total     5     0    0               5
107   INC      IX        5     1    0               5
108   LDD      counter   0     1    0               5
109   INC      ACC       1     1    0               5
110   STO      counter   1     1    1               5
111   CMP      #3        1     1    1               5
112   JPN      loop      1     1    1               5
104   LDX      number    7     1    1               5
105   ADD      total     12    1    1               5
106   STO      total     12    1    1               12
107   INC      IX        12    2    1               12
108   LDD      counter   1     2    1               12
109   INC      ACC       2     2    1               12
110   STO      counter   2     2    2               12
111   CMP      #3        2     2    2               12
112   JPN      loop      2     2    2               12
104   LDX      number    3     2    2               12
105   ADD      total     15    2    2               12
106   STO      total     15    2    2               15
107   INC      IX        15    3    2               15
108   LDD      counter   2     3    2               15
109   INC      ACC       3     3    2               15
110   STO      counter   3     3    3               15
111   CMP      #3        3     3    3               15
112   JPN      loop      3     3    3               15
113   END
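The loop program can be checked in the same way with a small interpreter. This Python sketch is an illustration only: the flag named equal, the dictionary layout and the function name run_loop are assumptions, and it handles just the opcodes that the loop program uses.

```python
SYMBOLS = {"loop": 104, "number": 115, "counter": 118, "total": 119}

PROGRAM = {
    100: ("LDM", "#0"), 101: ("STO", "total"), 102: ("STO", "counter"),
    103: ("LDR", "#0"), 104: ("LDX", "number"), 105: ("ADD", "total"),
    106: ("STO", "total"), 107: ("INC", "IX"), 108: ("LDD", "counter"),
    109: ("INC", "ACC"), 110: ("STO", "counter"), 111: ("CMP", "#3"),
    112: ("JPN", "loop"), 113: ("END", None),
}


def run_loop():
    """Execute the loop program; return (ACC, IX, memory, instructions executed)."""
    memory = {115: 5, 116: 7, 117: 3, 118: 0, 119: 0}
    acc = ix = 0
    equal = False          # result of the most recent CMP
    pc = 100
    executed = 0
    while True:
        opcode, operand = PROGRAM[pc]
        pc += 1
        executed += 1
        if opcode == "END":
            break
        if opcode == "LDM":
            acc = int(operand[1:])
        elif opcode == "LDR":
            ix = int(operand[1:])
        elif opcode == "LDD":
            acc = memory[SYMBOLS[operand]]
        elif opcode == "LDX":
            acc = memory[SYMBOLS[operand] + ix]   # indexed addressing
        elif opcode == "ADD":
            acc += memory[SYMBOLS[operand]]
        elif opcode == "STO":
            memory[SYMBOLS[operand]] = acc
        elif opcode == "INC":
            if operand == "IX":
                ix += 1
            else:
                acc += 1
        elif opcode == "CMP":
            equal = acc == int(operand[1:])
        elif opcode == "JPN" and not equal:
            pc = SYMBOLS[operand]                 # jump back to the loop label
    return acc, ix, memory, executed


acc, ix, memory, executed = run_loop()
print("total:", memory[119], "counter:", memory[118], "IX:", ix)
```

The count of executed instructions matches the number of rows in the trace table, which is a quick way to confirm that the loop body runs exactly three times.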
ACTIVITY 4B
1 a) State the contents of the accumulator after the following instructions have been executed. The memory location with address 200 contains 300, the memory location with address 300 contains 50.
     i) LDM #200
     ii) LDD 200
     iii) LDI 200
  b) Write an assembly language instruction to:
     i) compare the accumulator with 5
     ii) jump to address 100 if the comparison is true.
2 a) Copy and complete the symbol table for this assembly language program. Assume that the translated program will start at memory address 100.
  b) Complete a trace table to show the execution of this assembly language program.
  c) State the task that this assembly language program performs.

Label      Opcode   Operand
           LDD      number1
           SUB      number2
           ADD      number3
           CMP      #10
           JPE      nomore
           ADD      number4
nomore:    STO      total
           END
number1:            #30
number2:            #40
number3:            #20
number4:            #50
total:              #0

3 a) Using the assembly language instructions given in this section, write an assembly language program to output the ASCII value of each element of an array of four elements.
  b) Complete the symbol table for your assembly language program. Assume that the translated program will start at memory address 100.
  c) Complete a trace table to show the execution of your assembly language program.
4.3 Bit manipulation
WHAT YOU SHOULD ALREADY KNOW
Try these two questions before you start the third part of this chapter.
1) Copy and complete the truth table for AND, OR and XOR.

Inputs    AND   OR   XOR
0   0
0   1
1   0
1   1

2) Identify three different types of shift used in computer programming.
Key terms
Shift – moving the bits stored in a register a given number of places within the register; there are different types of shift.
Logical shift – bits shifted out of the register are replaced with zeros.
Arithmetic shift – the sign of the number is preserved.
Cyclic shift – no bits are lost, bits shifted out of one end of the register are introduced at the other end of the register.
Left shift – bits are shifted to the left.
Right shift – bits are shifted to the right.
Monitor – to automatically take readings from a device.
Control – to automatically take readings from a device, then use the data from those readings to adjust the device.
Mask – a number that is used with the logical operators AND, OR or XOR to identify, remove or set a single bit or group of bits in an address or register.
4.3.1 Binary shifts
A shift involves moving the bits stored in a register a given number of places within the register. Each bit within the register may be used for a different purpose. For example, in the IR each bit identifies a different interrupt. There are several different types of shift.

Logical shift – bits shifted out of the register are replaced with zeros. For example, an 8-bit register containing the binary value 10101111 shifted left logically three places would become 01111000.

Arithmetic shift – the sign of the number is preserved. For example, an 8-bit register containing the binary value 10101111 shifted right arithmetically three places would become 11110101. Arithmetic shifts can be used for multiplication or division by powers of two.

Cyclic shift – no bits are lost during a shift. Bits shifted out of one end of the register are introduced at the other end of the register. For example, an 8-bit register containing the binary value 10101111 shifted left cyclically three places would become 01111101.

Left shift – bits are shifted to the left; gives the direction of shift for logical, arithmetic and cyclic shifts.

Right shift – bits are shifted to the right; gives the direction of shift for logical, arithmetic and cyclic shifts.
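The three worked examples above can be reproduced in Python. This is a minimal sketch for an 8-bit register; the function names and the WIDTH constant are illustrative assumptions rather than part of any instruction set.

```python
WIDTH = 8                      # register size in bits
MASK = (1 << WIDTH) - 1        # 0b11111111 keeps results within the register


def logical_left(value, places):
    """Shift left; zeros enter on the right, bits leaving on the left are lost."""
    return (value << places) & MASK


def logical_right(value, places):
    """Shift right; zeros enter on the left."""
    return (value & MASK) >> places


def arithmetic_right(value, places):
    """Shift right one place at a time, copying the sign bit back in."""
    sign = value & (1 << (WIDTH - 1))
    for _ in range(places):
        value = (value >> 1) | sign
    return value & MASK


def cyclic_left(value, places):
    """Rotate left; bits leaving on the left re-enter on the right."""
    places %= WIDTH
    return ((value << places) | (value >> (WIDTH - places))) & MASK


register = 0b10101111
for shift in (logical_left, arithmetic_right, cyclic_left):
    print(shift.__name__, format(shift(register, 3), "08b"))
```

Formatting the result with "08b" pads it to eight binary digits, which makes it easy to compare with the register contents given in the text.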
Table 4.10 shows the logical shifts that you are expected to use in assembly language programming.

Opcode   Operand   Explanation
LSL      n         Bits in ACC are shifted logically n places to the left. Zeros are introduced on the right-hand end
LSR      n         Bits in ACC are shifted logically n places to the right. Zeros are introduced on the left-hand end

Shifts are always performed on the ACC
▲ Table 4.10 Logical shifts in assembly language programming
4.3.2 Bit manipulation used in monitoring and control
In monitoring and control, each bit in a register or memory location can be used as a flag and would need to be tested, set or cleared separately. For example, a control system with eight different sensors would need to record when the data from each sensor had been processed. This could be shown using 8 different bits in the same memory location.
» AND is used to check if the bit has been set.
» OR is used to set the bit.
» XOR is used to clear a bit that has been set.
Table 4.11 shows the instructions used to check, set and clear a single bit or group of bits.

Opcode   Operand     Explanation
AND      n           Bitwise AND operation of the contents of ACC with the operand
AND      <address>   Bitwise AND operation of the contents of ACC with the contents of <address>
XOR      n           Bitwise XOR operation of the contents of ACC with the operand
XOR      <address>   Bitwise XOR operation of the contents of ACC with the contents of <address>
OR       n           Bitwise OR operation of the contents of ACC with the operand
OR       <address>   Bitwise OR operation of the contents of ACC with the contents of <address>

The results of logical bit manipulation are always stored in the ACC.
<address> can be an absolute address or a symbolic address.
The operand is used as the mask to set or clear bits
▲ Table 4.11 Instructions used to check, set and clear a single bit or group of bits
The assembly language code to test sensor 3 could be:

Opcode   Operand   Comment
LDD      sensors   Load content of sensors into ACC
AND      #B100     Mask to select bit 3 only
CMP      #B100     Check if bit 3 is set
JPN      process   Jump to process routine if bit not set
LDD      sensors   Load sensors into ACC
XOR      #B100     Clear bit 3 as sensor 3 has been processed
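The same masking operations can be tried out in Python. In this sketch the function names, the sample register values and the 8-bit width are assumptions for illustration; the mask 0b100 corresponds to #B100 in the listing above.

```python
SENSOR_3 = 0b100   # same mask as #B100: the third bit from the right


def is_set(register, mask):
    """AND with the mask, then compare: True only if every masked bit is 1."""
    return (register & mask) == mask


def set_bits(register, mask):
    """OR with the mask forces the masked bits to 1."""
    return register | mask


def toggle_bits(register, mask):
    """XOR with the mask flips the masked bits, so a set bit becomes clear."""
    return register ^ mask


def clear_bits(register, mask):
    """AND with the inverted mask clears the bits whatever their state (8-bit register assumed)."""
    return register & ~mask & 0xFF


sensors = 0b00001101
if is_set(sensors, SENSOR_3):
    sensors = toggle_bits(sensors, SENSOR_3)   # sensor 3 processed, flag cleared
print(format(sensors, "08b"))
```

XOR only clears a bit that is currently set; applied to a bit that is already 0 it would set it instead, which is why the assembly code tests the bit first. ANDing with the inverted mask, as in clear_bits, is a common alternative when the current state of the bit is not known.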
ACTIVITY 4C
1 a) State the contents of the accumulator after the following instructions have been executed. The accumulator contains B00011001.
     i) LSL #4
     ii) LSR #5
  b) Write an assembly language instruction to:
     i) set bit 4 in the accumulator
     ii) clear bit 1 in the accumulator.
2 a) Describe the difference between arithmetic shifts and logical shifts.
  b) Explain, with the aid of examples, how a cyclic shift works.
  c) This register is shown before and after it has been shifted. Identify the type of shift that has taken place.
     [Figure: register contents before and after the shift]

End of chapter questions
1 a) Write these six stages of the Von Neumann fetch-execute cycle in the correct order. [6]
     – instruction is copied from the MDR and is placed in the CIR
     – the instruction is executed
     – the instruction is decoded
     – the address contained in PC is copied to the MAR
     – the value in PC is incremented by 1
     – instruction is copied from memory location in MAR and placed in MDR
  b) Explain how the following affect the performance of a computer system.
     i) Width of the data bus and address bus. [2]
     ii) The clock speed. [2]
     iii) Use of dual core or quad core processors. [2]
  c) A student accessed the BIOS on their computer. They increased the clock speed from 2.5 GHz to 3.2 GHz. Explain the potential dangers in doing this. [2]
2 a) Explain the main differences between HDMI, VGA and USB ports when sending data to peripherals. [5]
  b) Describe how interrupts can be used to service a printer printing out a large 1000 page document. [5]
3 a) i) Name three special registers used in a typical processor. [3]
     ii) Explain the purpose of the three registers named in part i). [3]
  b) Explain how interrupts are used when a processor sends a document to a printer. [4]
4 A programmer is writing a program in assembly language. They need to use shift instructions. Describe, using examples, three types of shift instructions the programmer could use. [6]
5 An intruder detection system for a large house has four sensors. An 8-bit memory location stores the output from each sensor in its own bit position. The bit value for each sensor shows:
  – 1 – the sensor has been triggered
  – 0 – the sensor has not been triggered
  The bit positions are used as follows:

  Not used   Sensor 4   Sensor 3   Sensor 2   Sensor 1

  The output from the intruder detection system is a loud alarm.
  a) i) State the name of the type of system to which intruder detection systems belong. [1]
     ii) Justify your answer to part i). [1]
  b) Name two sensors that could be used in this intruder detection system. Give a reason for your choice. [4]
  c) The intruder system is set up so that the alarm will only sound if two or more sensors have been triggered. An assembly language program has been written to process the contents of the memory location.
This table shows part of the instruction set for the processor used.

Opcode   Operand      Explanation
LDD      <address>    Direct addressing. Load the contents of the given address to ACC
STO      <address>    Store the contents of ACC at the given address
INC      <register>   Add 1 to the contents of the register (ACC or IX)
ADD      <address>    Add the contents of the given address to the contents of ACC
AND      <address>    Bitwise AND operation of the contents of ACC with the contents of <address>
CMP      #n           Compare the contents of ACC with the number n
JMP      <address>    Jump to the given address
JPE      <address>    Following a compare instruction, jump to <address> if the compare was True
JGT      <address>    Following a compare instruction, jump to <address> if the content of ACC is greater than the number used in the compare instruction
END                   End the program and return to the operating system
Part of the assembly code is:

Label      Opcode   Operand
SENSORS:            B00001010
COUNT:
VALUE:              1
LOOP:      LDD      SENSORS
           AND      VALUE
           CMP      #0
           JPE      ZERO
           LDD      COUNT
           INC      ACC
           STO      COUNT
ZERO:      LDD      VALUE
           CMP      #8
           JPE      EXIT
           ADD      VALUE
           STO      VALUE
           JMP      LOOP
EXIT:      LDD      COUNT
TEST:      CMP      …
           JGT      ALARM
i) Copy the table below and dry run the assembly language code. Start at LOOP and finish when EXIT is reached. [4]

BITREG      COUNT   VALUE   ACC
B00001010           1

ii) The operand for the instruction labelled TEST is missing. State the missing operand. [1]
iii) The intruder detection system is improved and now has eight sensors. One instruction in the assembly language code will need to be amended. Identify this instruction. Write the amended instruction. [2]
Cambridge International AS & A Level Computer Science 9608 Paper 32 Q6 June 2016
5 System software
In this chapter, you will learn about
★ why computers need an operating system
★ key management tasks, such as memory management, file management, security management, hardware management and process management
★ the need for utility software, including disk formatters, virus checkers, defragmentation software, disk contents analysis and repair software, file compression and back-up software
★ program libraries, software under development using program library software and the benefits to software developers, including the use of dynamic link library (DLL) files
★ the need for these language translators: assemblers, compilers and interpreters
★ the benefits and drawbacks of using compilers or interpreters
★ an awareness that high-level language programs may be partially compiled and partially interpreted (such as Java™)
★ the features of a typical integrated development environment (IDE) for
  – coding (using context-sensitive prompts)
  – initial error detection (including dynamic syntax checks)
  – presentation (including pretty print, expand and collapse code blocks)
  – debugging (for example, single stepping, use of breakpoints, variables/expressions report windows).
5.1 Operating systems
WHAT YOU SHOULD ALREADY KNOW
Try these five questions before you read the first part of this chapter.
1 Microprocessors are commonly used to control microwave ovens, washing machines and many other household items. Explain why it is not necessary for these devices to have an operating system.
2 a) Name three of the most common operating systems used in computers and other devices, such as mobile phones and tablets.
  b) A manufacturer makes laptop computers, mobile phones and tablets. Explain why it is necessary for the manufacturer to develop different versions of its operating system for use on its computers, mobile phones and tablets.
3 Most operating systems offer a graphical user interface (GUI) as well as a command line interface (CLI).
  a) What are the main differences between the two types of interface?
  b) What are the pros and cons of both types of interface?
  c) Who would use each type of interface?
4 Before the advent of the operating system, computers relied on considerable human intervention. Find out the methods used to start up early computers to prepare them for the day’s tasks.
5 Describe the role of buffers and interrupts when a printing job is being sent to an inkjet printer. Consider the different operational speeds of a processor and a printer, together with size of printing job and interrupt priorities. Describe potential error scenarios – such as paper jam, out of paper or out of ink – and how these could affect the printing job.

Key terms
CMOS – complementary metal-oxide semiconductor.
Operating system – software that provides an environment in which applications can run and provides an interface between hardware and human operators.
HCI – human–computer interface.
GUI – graphical user interface.
CLI – command line interface.
Icon – small picture or symbol used to represent, for example, an application on a screen.
WIMP – windows, icons, menu and pointing device.
Post-WIMP – interfaces that go beyond WIMP and use touch screen technology rather than a pointing device.
Pinching and rotating – actions by fingers on a touch screen to carry out tasks such as move, enlarge, reduce, and so on.
Memory management – part of the operating system that controls the main memory.
Memory optimisation – function of memory management that determines how memory is allocated and deallocated.
Memory organisation – function of memory management that determines how much memory is allocated to an application.
Security management – part of the operating system that ensures the integrity, confidentiality and availability of data.
Contiguous – items next to each other.
Virtual memory systems – memory management (part of OS) that makes use of hardware and software to enable a computer to compensate for shortage of actual physical memory.
Memory protection – function of memory management that ensures two competing applications cannot use the same memory locations at the same time.
Process management – part of the operating system that involves allocation of resources and permits the sharing and exchange of data.
Hardware management – part of the operating system that controls all input/output devices connected to a computer (made up of sub-management systems such as printer management, secondary storage management, and so on).
Device driver – software that communicates with the operating system and translates data into a format understood by the device.
Utility program – parts of the operating system which carry out certain functions, such as virus checking, defragmentation or hard disk formatting.
Disk formatter – utility that prepares a disk to allow data/files to be stored and retrieved.
Bad sector – a faulty sector on an HDD which can be soft or hard.
Antivirus software – software that quarantines and deletes files or programs infected by a virus (or other malware). It can be run in the background or initiated by the user.
Heuristic checking – checking of software for behaviour that could indicate a possible virus.
Quarantine – file or program identified as being infected by a virus which has been isolated by antivirus software before it is deleted at a later stage.
False positive – a file or program identified by a virus checker as being infected but the user knows this cannot be correct.
Disk defragmenter – utility that reorganises the sectors on a hard disk so that files can be stored in contiguous data blocks.
Disk content analysis software – utility that checks disk drives for empty space and disk usage by reviewing files and folders.
Disk compression – software that compresses data before storage on an HDD.
Back-up utility – software that makes copies of files on another portable storage device.
Program library – a library on a computer where programs and routines are stored which can be freely accessed by other software developers for use in their own programs.
Library program – a program stored in a library for future use by other programmers.
Library routine – a tested and ready-to-use routine available in the development system of a programming language that can be incorporated into a program.
Dynamic link library (DLL) file – a library routine that can be linked to another program only at the run time stage.
5.1.1 The need for an operating system Early computers had no operating system at all. Control software had to be loaded each time the computer was started – this was done using either paper tape or punched cards. In the 1970s, the home computer was becoming increasingly popular. Early examples, such as the Acorn BBC B, used an internal ROM chip to store part of the operating system. A cassette tape machine was also used to load the remainder of the operational software (see Figure 5.1). This was necessary to ‘get the computer started’ and used a welcome cassette tape which had to be used each time the computer was turned on.
▲ Figure 5.1 An Acorn BBC B (left) and its cassette tape machine (right)
As the hard disk drive (HDD) was developed, operating systems were stored on the hard disk, and start-up of the motherboard was handled by the basic input/output system (BIOS). Initially, the BIOS was stored on a ROM chip but, in modern computers, the BIOS contents are stored on a flash memory chip. The BIOS configuration is stored in CMOS memory (complementary metal-oxide semiconductor) which means it can be altered or deleted as required. Only the required part of the operating system is copied into RAM – since operating systems are now so large, it would seriously affect a computer’s performance if it was all loaded into RAM at once. An operating system provides both the environment in which applications can be run, and a usable interface between humans and computer. An operating system also disguises the complexity of computer hardware. Common examples include Microsoft Windows®, Apple macOS, Google Android and iOS (Apple mobile phones and tablets).
The human–computer interface (HCI) is usually achieved through a graphical user interface (GUI), although it is possible to use a command line interface (CLI) if the user wishes to directly communicate with the computer. A CLI requires a user to type instructions to choose options from menus, open software, and so on. There are often a number of commands that need to be typed; for example, to save or load a file. The user, therefore, has to learn a number of commands (which must be typed exactly with no errors) just to carry out basic operations. Furthermore, it takes time to key in commands every time an operation has to be carried out.
The advantage of CLI is that the user is in direct communication with the computer and is not restricted to a number of pre-determined options. For example, the following section of CLI imports data from table A into table B. It shows how complex it is just to carry out a straightforward operation.
    SQLPrepare(hStmt,
               (SQLCHAR *) "INSERT INTO tableB SELECT * FROM tableA",
               SQL_NTS);
    SQLExecute(hStmt);

A GUI allows the user to interact with a computer (or MP3 player, gaming device, mobile phone, and so on) using pictures or symbols (icons). For example, the whole of the above CLI code could have been replaced by a single icon labelled ‘Table update’. Selecting this icon would execute all of the steps shown in the CLI without the need to type them.
GUIs use various technologies and devices to provide the user interface. One of the first commonly used GUI environments was known as windows, icons, menu and pointing device (WIMP), which was developed for use on personal computers (PCs). Here, a mouse is used to control a cursor and icons are selected to open and run windows. Each window contains an application. Modern computer systems allow several windows to be open at the same time. An example is shown in Figure 5.2.
▲ Figure 5.2 An example of WIMP
A windows manager looks after the interaction between windows, the applications and windowing system (which handles the pointing devices and the cursor’s position). However, smart phones, tablets and many computers now use a post-WIMP interaction where fingers are in contact with the screen, allowing actions such as pinching and rotating which are difficult using a single pointer and device such as a mouse. Also, simply tapping the icon with a finger (or stylus) will launch the application. Developments in touch screen technology mean these flexible HCIs are now readily available.
5.1.2 Operating system tasks
[Figure: the operating system at the centre, linked to memory management, file management, hardware management, security management and process management]
▲ Figure 5.3 Operating system tasks
Memory management
Memory management, as the name suggests, is the management of a computer’s main memory. This can be broken down into three parts: memory optimisation, memory organisation and memory protection.

Memory optimisation
Memory optimisation is used to determine how computer memory is allocated and deallocated when a number of applications are running simultaneously. It also determines where they are stored in memory. It must, therefore, keep track of all allocated memory and free memory available for use by applications. To maintain optimisation of memory, it will also swap data to and from the HDD or SSD.

Memory organisation
Memory organisation determines how much memory is allocated to an application, and how the memory can be split up in the most appropriate or efficient manner. This can be done with the use of
» a single (contiguous) allocation, where all of the memory is made available to a single application. This is used by MS-DOS and by embedded systems
» partitioned allocation, where the memory is split up into contiguous partitions (or blocks) and memory management then allocates a partition (which can vary in size) to an application
» paged memory, which is similar to partitioned allocation, but each partition is of a fixed size. This is used by virtual memory systems
» segmented memory, which is different because memory blocks are not contiguous – each segment of memory will be a logical grouping of data (such as the data which may make up an array).

Memory protection
Memory protection ensures that two competing applications cannot use the same memory locations at the same time. If this was not done, data could be lost, applications could produce incorrect results, there could be security issues, or the computer may crash.

Memory protection and memory organisation are different aspects of an operating system. An operating system may use a typical type of memory organisation (for example, it may use paging or segmentation) but it is always important that no two applications can occupy the same part of memory.
457591_05_CI_AS & A_Level_CS_136-158.indd 140
25/04/19 9:27 AM
Therefore, memory protection must always be a part of any type of memory organisation used. Figure 5.4 shows how different applications can be kept separate from each other. Address
[Figure: memory map with addresses starting at 0 and ending at Z. The operating system occupies the lowest addresses, up to A. Memory allocated to application 1 begins at boundary address A+1, memory allocated to application 2 begins at boundary address B+1, and memory allocated to application 3 begins at boundary address C+1 and ends at Z, the upper limit to the address value.
A FENCE defines the boundary between the operating system and the applications; it is not possible for an application to access a memory location which is lower than the FENCE address.
These boundaries mark the upper limits of the addresses available to each application (addresses start at 0 and end at Z); the boundaries (A + 1, B + 1, C + 1) are often referred to as a FENCE.]
▲ Figure 5.4 Memory protection
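The boundary check that memory protection relies on can be sketched as follows. This is an illustrative model only: the numeric values chosen for A, B, C and Z, the REGIONS table and the function name check_access are assumptions, and in a real computer this check is carried out by hardware rather than by Python code.

```python
class MemoryProtectionError(Exception):
    """Raised when an application touches memory outside its own region."""


# Assumed example layout (the numbers are illustrative, not from the figure):
# the operating system occupies 0..A, application 1 occupies A+1..B, and so on.
A, B, C, Z = 999, 3999, 6999, 9999
REGIONS = {
    "application 1": (A + 1, B),
    "application 2": (B + 1, C),
    "application 3": (C + 1, Z),
}


def check_access(app, address):
    """Return the address if `app` may use it, otherwise raise an error."""
    lower, upper = REGIONS[app]
    if not lower <= address <= upper:
        raise MemoryProtectionError(f"{app} may not access address {address}")
    return address
```

With this layout, application 1 may use address 1000, but an attempt to use address 500 (inside the operating system’s memory, below the FENCE) is rejected.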
Security management
Security management is another part of a typical operating system. The function of security management is to ensure the integrity, confidentiality and availability of data. This can be achieved by
» carrying out operating system updates as and when they become available
» ensuring that antivirus software (and other security software) is always up to date
» communicating with, for example, a firewall to check all traffic to and from the computer
» making use of privileges to prevent users entering ‘private areas’ on a computer which permits multi-user activity (this is done by setting up user accounts and making use of passwords and user IDs). This helps to ensure the privacy of data
» maintaining access rights for all users
» offering the ability for the recovery of data (and system restore) when it has been lost or corrupted
» helping to prevent illegal intrusion to the computer system (also ensuring the privacy of data).
Note: many of these features are covered in more depth elsewhere in this chapter or in other chapters.
EXTENSION ACTIVITY 5A While working through the remainder of Chapters 5 and 6, find out all of the methods available to ensure the security, privacy and integrity of data and how these link into the operating system security management. It is important to distinguish between what constitutes security, privacy and integrity of data.
Process management A process is a program which is being run on a computer. Process management involves the allocation of resources and permits the sharing and exchange of data, thus allowing all processes to be fully synchronised (for example, by the scheduling of resources, resolution of software conflicts, use of queues and so on). This is covered in more depth in Chapter 16. Hardware management Hardware management involves all input and output peripheral devices. The functions of hardware management include » communicating with all input and output devices using device drivers » translating data from a file (defined by the operating system) into a format that the input/output device can understand using device drivers » ensuring each hardware resource has a priority so that it can be used and released as required.
The management of input/output devices is essentially the control and management of queues and buffers. For example, when printing out a document, the printer management » locates and loads the printer driver into memory » sends data to a printer buffer ready for printing » sends data to a printer queue (if the printer is busy or the print job has a low priority) before sending to the printer buffer » sends various control commands to the printer throughout the printing process » receives and handles error messages and interrupts from the printer.
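The interplay between a printer queue and a printer buffer can be modelled with a short sketch. The class name PrinterManager, its methods and the document names are hypothetical, and a real printer driver is far more involved; the point is only to show jobs waiting in a priority queue until there is room in the buffer.

```python
import heapq
from collections import deque
from itertools import count


class PrinterManager:
    """Toy print spooler: a priority queue of jobs feeding a small buffer."""

    def __init__(self, buffer_size):
        self.buffer = deque()          # documents waiting to be printed
        self.buffer_size = buffer_size
        self.queue = []                # waiting jobs, lowest number first
        self._order = count()          # tie-breaker keeps equal priorities FIFO

    def submit(self, document, priority):
        """Queue a job; a smaller `priority` number is printed sooner."""
        heapq.heappush(self.queue, (priority, next(self._order), document))

    def fill_buffer(self):
        """Move queued jobs into the buffer while there is room."""
        while self.queue and len(self.buffer) < self.buffer_size:
            _, _, document = heapq.heappop(self.queue)
            self.buffer.append(document)

    def print_next(self):
        """Send the next buffered document to the printer (returned here)."""
        self.fill_buffer()
        return self.buffer.popleft() if self.buffer else None


manager = PrinterManager(buffer_size=2)
manager.submit("report.docx", priority=2)
manager.submit("urgent.docx", priority=1)
manager.submit("notes.txt", priority=2)
printed = [manager.print_next() for _ in range(4)]
```

The high-priority job is printed first, the two equal-priority jobs are printed in the order they were submitted, and the final call returns None because nothing is left to print.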
EXTENSION ACTIVITY 5B Write down the tasks carried out by a keyboard manager when a user types text using a word processor. Consider the use of buffers and queues in your answer.
File management The main tasks of file management include
» defining the file naming conventions which can be used (filename.docx, where the extension can be .bat, .htm, .dbf, .txt, .xls, and so on) » performing specific tasks, such as create, open, close, delete, rename, copy, move » maintaining the directory structures » ensuring access control mechanisms are maintained, such as access rights to files, password protection, making files available for editing, locking files, and so on » specifying the logical file storage format (such as FAT or NTFS if Windows is being used), depending on which type of disk formatter is used (see Section 5.1.3) » ensuring memory allocation for a file by reading it from the HDD/SSD and loading it into memory.
5.1.3 Utility software
Computer users are provided with a number of utility programs that are part of the operating system. However, users can also install their own utility software in addition. This software is usually initiated by the user, but some, such as virus checkers, can be set up to constantly run in the background. Utility software offered by most operating systems includes
» hard disk formatter
» virus checker
» defragmentation software
» disk contents analysis/repair software
» file compression
» back-up software.

Hard disk formatter
A new hard disk drive needs to be initialised ready for formatting. The operating system needs to know how to store files and where the files will be stored on the hard disks. A disk formatter will organise storage space by assigning it to data blocks (partitions). A disk surface may have a number of partitions (see Chapter 3 for more details regarding the organisation of data on hard disks). Note that partitions are contiguous blocks of data.
Once the partitions have been created, they must be formatted. This is usually done by writing files which will hold directory data and tables of contents (TOC) at the beginning of each partition. This allows the operating system to recognise a file and know where to find it on the disk surface. Different operating systems will use different filing systems; Windows, for example, uses the New Technology File System (NTFS). When carrying out full formatting using NTFS, all disk sectors are filled with zeros; these zeros are read back, thus testing the sector, but any data already stored there will be lost. So, it is important to remember that reformatting an HDD which has already been used will result in loss of data during the formatting procedure.

Disk formatters also have checking tools, which are non-destructive tests that can be carried out on each sector. If any bad sector errors are discovered, the sectors will be flagged as ‘bad’ and the file tracking records will be reorganised – this is done by replacing the bad sectors with new unused sectors, effectively repairing the faulty disk. A damaged file will now contain an ‘empty’ sector, which allows the file to be read but it will be corrupted since the bad sector will have contained important (and now lost) data. It would, therefore, be prudent to delete the damaged file leaving the rest of the HDD effectively repaired.

Bad sectors can be categorised as hard or soft. There are a number of ways that they can be produced, as shown in Table 5.1.

Hard bad sectors (difficult to repair)
» caused by manufacturing errors
» damage to disk surface caused by allowing the read-write head to touch the disk surface (for example, by moving HDD without first parking the read-write head)
» system crash which could lead to damage to the disk surface(s)

Soft bad sectors
» sudden loss of power leading to data corruption in some of the sectors
» effect of static electricity leading to corruption of data in some of the sectors on the hard disk surfaces

▲ Table 5.1 Hard and soft bad sectors
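The idea behind a full format (write zeros, read them back, flag any sector that fails) can be sketched with a toy model. The Disk class, its one-byte sectors and the choice of faulty sectors are assumptions made purely for illustration; a real formatter works through the disk controller, not a Python list.

```python
class Disk:
    """Toy disk: each sector holds one byte; some sectors are faulty."""

    def __init__(self, sector_count, faulty=()):
        self.sectors = [0xFF] * sector_count  # pretend old data is present
        self.faulty = set(faulty)             # sectors that cannot store data

    def write(self, sector, value):
        if sector not in self.faulty:         # a faulty sector ignores writes
            self.sectors[sector] = value

    def read(self, sector):
        return self.sectors[sector]


def full_format(disk):
    """Zero every sector, read it back, and return the set of bad sectors."""
    bad_sectors = set()
    for sector in range(len(disk.sectors)):
        disk.write(sector, 0)
        if disk.read(sector) != 0:    # the zero did not 'stick'
            bad_sectors.add(sector)   # flag the sector so it is avoided
    return bad_sectors


disk = Disk(sector_count=8, faulty={2, 5})
bad = full_format(disk)
```

After the call, every healthy sector holds zero (so the old data has gone) and the two faulty sectors have been flagged as bad.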
Virus checkers Any computer (including mobile phones and tablets) can be subject to a virus attack (see Chapter 6). There are many ways to help prevent viruses, such as being careful when downloading material from the internet, not opening files or links in emails from unknown senders, and by only using certified software. However, virus checkers – which are offered by operating systems – still provide the best defence against malware, as long as they are kept up to date and constantly run in the background. Running antivirus software in the background on a computer will constantly check for virus attacks. Although various types of antivirus software work in different ways, they have some common features. They » check software or files before they are run or loaded on a computer » compare possible viruses against a database of known viruses » carry out heuristic checking – this is the checking of software for types of behaviour that could indicate a possible virus, which is useful if software is infected by a virus not yet on the database » put files or programs which may be infected into quarantine, to – automatically delete the virus, or – allow the user to decide whether to delete the file (it is possible that the user knows that the file or program is not infected by a virus – this is known as a false positive and is one of the drawbacks of antivirus software).
Antivirus software needs to be kept up to date since new viruses are constantly being discovered. Full system checks need to be carried out once a week, for example, since some viruses lie dormant and would only be picked up by this full system scan.
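The two kinds of check described above (comparison against a database of known viruses, then heuristic checking) can be sketched as follows. Everything here is hypothetical: the ‘database’ holds the hash of a made-up byte string, and the suspicious patterns are invented. Real antivirus software uses far richer signatures and behavioural analysis.

```python
import hashlib

# Hypothetical 'database of known viruses': SHA-256 digests of bad files.
KNOWN_VIRUS_HASHES = {
    hashlib.sha256(b"pretend this is a known virus").hexdigest(),
}

# Hypothetical byte patterns treated as signs of suspicious behaviour.
SUSPICIOUS_PATTERNS = (b"delete_all_files", b"disable_antivirus")


def scan(data):
    """Classify file contents as 'infected', 'suspicious' or 'clean'."""
    if hashlib.sha256(data).hexdigest() in KNOWN_VIRUS_HASHES:
        return "infected"    # exact match against the database
    if any(pattern in data for pattern in SUSPICIOUS_PATTERNS):
        return "suspicious"  # heuristic hit: quarantine and ask the user
    return "clean"
```

A file reported as ‘suspicious’ is the case where a false positive can arise, which is why the user is given the final decision.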
Defragmentation software
As an HDD becomes full, blocks used for files will become scattered all over the disk surface (in potentially different sectors and tracks as well as different surfaces). This happens as files are deleted, partially-deleted, extended and so on. The consequence is slower data access time: the HDD read-write head requires several movements just to find and retrieve the data making up the required file. It would be advantageous if files could be stored in contiguous sectors, considerably reducing HDD head movements. Note that, due to their different operation when accessing data, this is less of a problem with SSDs.

Consider the following example using a disk with 12 sectors per surface. We have three files (1, 2 and 3) stored on track 8 of the disk surface.

[Figure: track 8, sectors 1 to 12. File 1 occupies sectors 1 to 3, File 2 occupies sectors 4 to 6 and File 3 occupies sectors 7 to 9; the remaining sectors are empty.]
▲ Figure 5.5
File 2 is deleted by the user and file 1 has data added to it. However, the file 2 sectors which become vacant are not filled up straight away by new file 1 data since this would require ‘too much effort’ for the HDD resources. We get the following.

[Figure: track 8 now holds File 1, followed by the empty sectors left by File 2, then File 3, then a second block of File 1.]
▲ Figure 5.6

File 1 has been extended to write data in sectors 10 and 11.

Now, suppose file 3 is extended with the equivalent of 3.25 blocks of data. This requires filling up sector 9 and then moving to some empty sectors to write the remainder of the data – the next free sectors are on track 11.

[Figure: track 8 holds File 1, File 3 and File 1; track 11 holds the remainder of File 3.]
▲ Figure 5.7

If this continues, the files just become more and more scattered throughout the disk surfaces. It is possible for sectors 4, 5 and 6 (on track 8) to eventually become used if the disk starts to fill up and it has to use up whatever space is available. A disk defragmenter will rearrange the blocks of data to store files in contiguous sectors wherever possible; however, if the disk drive is almost full, defragmentation may not work. Assuming we can carry out defragmentation, then track 8 now becomes:

[Figure: track 8 holds File 1 in contiguous sectors followed by File 3 in contiguous sectors.]
▲ Figure 5.8
This allows for much faster data access and retrieval since the HDD now requires fewer read-write head movements to access and read files 1 and 3. Some defragmenters also carry out clean up operations. Data blocks can become damaged after several read/write operations (this is different to bad sectors). If this happens, they are flagged as ‘unusable’ and any subsequent write operation will avoid writing data to data blocks which have become affected.
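The rearrangement carried out by a defragmenter can be sketched on a single track. The function name defragment and the list-of-sectors representation are assumptions for illustration, and the starting layout is loosely modelled on the example above (file 2 deleted, file 1 extended into sectors 10 and 11); a real defragmenter has to move data between tracks and surfaces as well.

```python
def defragment(track):
    """Return a new layout with each file's blocks in contiguous sectors.

    `track` is a list with one entry per sector: a file name, or None
    for an empty sector. Files keep the order in which they first appear.
    """
    order = []                      # file names in order of first appearance
    for block in track:
        if block is not None and block not in order:
            order.append(block)
    # Collect every block of the first file, then every block of the next...
    packed = [name for name in order for block in track if block == name]
    free = [None] * (len(track) - len(packed))   # empty sectors go at the end
    return packed + free


# 12 sectors: File 1, three empty sectors, File 3, more of File 1, one empty.
fragmented = ["File 1"] * 3 + [None] * 3 + ["File 3"] * 3 + ["File 1"] * 2 + [None]
tidy = defragment(fragmented)
```

In the tidy layout the five blocks of File 1 sit together, followed by the three blocks of File 3, so the read-write head needs fewer movements to read either file.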
Disk content analysis/repair software
The concept of disk repair software was discussed in the above section. Disk content analysis software is used to check disk drives for empty space and disk usage by reviewing files and file folders. This can lead to optimal use of disk space by the removal of unwanted files and downloads (such as the deletion of auto saving files, cookies, download files, and so on).

Disk compression and file compression
File compression is essential to save storage space and make it quicker to download/upload files and quicker to send files via email. It was discussed in Chapter 1. Disk compression is much less common these days due to the vast size of HDDs (often more than 2 TB). The disk compression utility compresses data before writing it to hard disk (and decompresses it again when reading this data). It is a high priority utility and will essentially override all other operating system routines – this is essential because all applications need to have access to the HDD. It is important not to uninstall disk compression software since this would render any previously saved data unreadable.
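The compress-on-write, decompress-on-read idea can be demonstrated with Python’s standard zlib module. This is only an illustration of the principle; the function names are hypothetical and the text does not say which compression algorithm a disk compression utility would use.

```python
import zlib


def write_compressed(data):
    """Compress data before it is written to disk."""
    return zlib.compress(data)


def read_compressed(stored):
    """Decompress data after it has been read back from disk."""
    return zlib.decompress(stored)


original = b"AAAAABBBBBCCCCC" * 100   # highly repetitive sample data
stored = write_compressed(original)
restored = read_compressed(stored)
```

The stored form is much smaller than the original, and the original can only be recovered by passing the stored bytes back through the decompression routine, which is why removing the utility would leave the saved data unreadable.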
Back-up software While it is sensible to take manual back-ups using, for example, a memory stick or portable HDD, it is also good practice to use the operating system back-up utility. This utility will » allow a schedule for backing up files to be made » only carry out a back-up procedure if there have been any changes made to a file.
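Backing up a file only when it has changed can be sketched by comparing a fingerprint of each file with the fingerprint recorded at the last back-up. For simplicity this example keeps ‘files’ in dictionaries rather than on a real disk; the function names and file names are hypothetical.

```python
import hashlib


def digest(contents):
    """Fingerprint of a file's contents."""
    return hashlib.sha256(contents).hexdigest()


def back_up(files, backup, fingerprints):
    """Copy only new or changed files into `backup`.

    `files` and `backup` map file names to contents (bytes);
    `fingerprints` remembers the digest of each file at its last back-up.
    Returns the names of the files that were copied.
    """
    copied = []
    for name, contents in files.items():
        current = digest(contents)
        if fingerprints.get(name) != current:  # new or changed since last run
            backup[name] = contents
            fingerprints[name] = current
            copied.append(name)
    return copied


files = {"essay.docx": b"draft 1", "notes.txt": b"shopping list"}
backup, fingerprints = {}, {}
first_run = back_up(files, backup, fingerprints)    # everything is new
files["essay.docx"] = b"draft 2"                    # the user edits one file
second_run = back_up(files, backup, fingerprints)   # only the edited file
```

On the first run both files are copied; on the second run only the edited file is copied again.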
For total security, there should be three versions of a file:
1 The current (working) version stored on the internal HDD.
2 A locally backed up copy of the file (stored on a portable SSD, for example).
3 A remote back-up version stored well away from the computer (using cloud storage, for example).

The Windows environment offers the following facilities using the back-up utility:
» The ability to restore data, files or the computer from the back-up (useful if there has been a problem and files have been lost and need to be recovered).
» The ability to create a restore point (this restores a computer to its state at some point in the past; this can be very useful if a very important file has been deleted and cannot be recovered by any of the other utilities).
» Options of where to save back-up files; this can be set up from the utility to ensure files are automatically backed up to a chosen device.

Windows uses File History, which takes snapshots of files and stores them on an external HDD at regular intervals. Over a period of time, File History builds up a vast library of past versions of files – this allows a user to choose which version of the file they want to use. File History defaults to backing up every hour and retains past versions of files forever unless the user changes the settings.

Mac OS offers the Time Machine back-up utility. This erases the contents of a selected drive and replaces them with the contents from the back-up. To use this facility it is necessary to have an external HDD or SSD (connected via USB port) and ensure that the Time Machine utility is installed and activated on the selected computer. Time Machine will automatically
» back up every hour
» keep daily back-ups for the past month, and
» keep weekly back-ups for all the previous months.
Note that once the back-up HDD or SSD is almost full, the oldest back-ups are deleted and replaced with the newest back-up data. Figure 5.9 shows the Time Machine message:
▲ Figure 5.9 Screen shot of Time Machine message
5.1.4 Program libraries
Program libraries are used
» when software is under development and the programmer can utilise pre-written subroutines in their own programs, thus saving considerable development time
» to help a software developer who wishes to use dynamic link library (DLL) subroutines in their own program, so these subroutines must be available at run time.

When software routines are written (such as a sort routine), they are frequently saved in a program library for future use by other programmers. A program stored in a program library is known as a library program. We also have the term library routines to describe subroutines which could be used in another piece of software under development. Suppose we are writing a game for children with animated graphics (of a friendly panda) using music routines and some scoreboard graphics.

[Figure: game screen showing the message ‘Well done Freddie!! You got 6 right’]
▲ Figure 5.10

This game could be developed using existing routines from a library.

[Figure: the new game under development draws on three sets of library routines: friendly panda animation routines, children’s music routines and final scoring graphics]
▲ Figure 5.11
Developing software in this way
» removes the need to rewrite the many routines every single time (thus saving considerable time and cost) » leads to modular programming, which means several programmers can be working on the same piece of software at the same time » allows continuity with other games that may form part of a whole range (in education, where there may be a whole suite of programs, for example) » allows the maintenance of a ‘corporate image’ in all the software being developed by a particular company » saves considerable development time having to test each routine, since the routines are all fully tested in other software and should be error-free.
All operating systems have two types of program library containing library programs and library routines: static and dynamic. In static libraries, software being developed is linked to executable code in the library at the time of compilation. So the library routines would be embedded directly into the new program code. In dynamic libraries, software being developed is not linked to the library routines until actual run time (these are known as dynamic link library files or DLL). These library routines would be stand-alone files only being accessed as required by the new program – the routines will be available to several applications at the same time. When using DLL, since the library routines are not loaded into RAM until required, memory is saved, and software runs faster.

For example, suppose we are writing new software which allows access to a printer as part of its specification. The main program will be developed and compiled. Once the object code is run, it will only access (and load up) the printer routine from DLL when required by the user of the program. The main program will only contain a link to the printer library routine and will not contain any of the actual printer routine coding in the main body. Table 5.2 summarises the pros and cons of using DLL files.

Pros of using DLL files
» the executable code of the main program is much smaller since DLL files are only loaded into memory at run time
» it is possible to make changes to DLL files independently of the main program, consequently if any changes are made to the DLL files it will not be necessary to recompile the main program
» DLL files can be made available to a number of applications at the same time
» all of the above save memory and also save execution time

Cons of using DLL files
» the executable code is not self-contained, therefore all DLL files need to be available at run time otherwise error messages (such as missing .dll error) will be generated and the software may even crash
» any DLL linking software in the main program needs to be available at run time to allow links with DLL files to be made
» if any of the DLL files have been changed (either intentionally or through corruption) this could lead to the main program giving unexpected results or even crashing
» malicious changes to DLL files could be due to the result of malware, thus presenting a risk to the main program following the linking process

▲ Table 5.2 Pros and cons of using DLL files
ACTIVITY 5A
1 a) i) Explain why a computer needs an operating system.
     ii) Name two management tasks carried out by the operating system.
  b) A new program is to be written in a high-level language. The developer has decided to use DLL files in the design of the new program.
     i) Explain what is meant by a DLL file. How does this differ from a static library routine?
     ii) Describe two potential drawbacks of using DLL files in the new program.
2 A company produces glossy geography magazines. Each magazine is produced using a network of computers where thousands of photographs and drawings need to be stored. The computers also have an external link to the internet. Name, and describe the function of, three utility programs the company would use on all its computers.
3 A computer user has a number of important issues, listed below. For each issue, name a utility which could help solve it. Give a reason for each choice.
  a) The user wants to send a number of very large attachments by email, but the recipient cannot accept attachments greater than 20 MB.
  b) The user has accidentally deleted files in the past. It is essential that this cannot happen in the future.
  c) The user has had their computer for a number of years. The time to access and retrieve data from the hard disk drive is increasing.
  d) Last week, the user clicked on a link in an email from a friend; since then the user’s computer has been running slowly, files are being lost, and they are receiving odd messages.
  e) Some of the files on the user’s HDD have corrupted and will not open and this is affecting the performance of the HDD.
5.2 Language translators WHAT YOU SHOULD ALREADY KNOW Try these two questions before you read the second part of this chapter: 1 a) Name two types of language translator. b) Identify a method, other than using a translator, of executing a program written in a high-level language.
2 Most modern language translators offer an Integrated Development Environment (IDE) for program development. a) Which IDE are you using? b) Describe five features offered by the IDE you use. c) Which feature do you find most useful? Why is it useful to you?
Key terms Translator – the systems software used to translate a source program written in any language other than machine code.
Interpreter – a computer program that analyses and executes a program written in a high-level language line by line.
Compiler – a computer program that translates a source program written in a high-level language to machine code or p-code, object code.
Prettyprinting – the practice of displaying or printing well set out and formatted source code, making it easier to read and understand.
Integrated development environment (IDE) – a suite of programs used to write and test a computer program written in a high-level programming language. Syntax error – an error in the grammar of a source program. Logic error – an error in the logic of a program.
Debugging – the process of finding logic errors in a computer program by running or tracing the program.
Single stepping – the practice of running a program one line/instruction at a time. Breakpoint – a deliberate pause in the execution of a program during testing so that the contents of variables, registers, and so on can be inspected to aid debugging. Report window – a separate window in the run-time environment of the IDE that shows the contents of variables during the execution of a program.
5.2.1 Translation and execution of programs
Instructions in a program can only be executed when written in machine code and loaded into the main memory of a computer. Programming instructions written in any programming language other than machine code must be translated before they can be used. The systems software used to translate a source program written in any language other than machine code is called a translator. There are three types of translator available; each performs a different role.
Assemblers
Programs written in assembly language are translated into machine code by an assembler program. Assemblers either store the program directly in main memory, ready for execution, as it is translated, or they store the translated program on a storage medium to be used later. If stored for later use, then a loader program is also needed to load the stored translated program into main memory before it can be executed. The stored translated program can be executed many times without being re-translated.

Every different type of computer/chip has its own machine code and assembly language. For example, MASM is an assembler that is used for the x86 family of chips, while PIC and GENIE are used for microcontrollers. Assembly language programs are machine dependent; they are not portable from one type of computer/chip to another. Here is a short sample PIC assembly program:

      movlw B'00000000'
      tris  PORTB
      movlw B'00000011'
      movwf PORTB
stop: goto  stop

Assembly language programs are often written for tasks that need to be speedily executed, for example, parts of an operating system, central heating system or controlling a robot.

EXTENSION ACTIVITY 5C
Find out what task this very short sample PIC assembly program is performing.
Compilers and interpreters
Programs written in a high-level language can be either translated into machine code by a compiler program, or directly executed line-by-line using an interpreter program.

Compilers usually store the translated program (object program) on a storage medium ready to be executed later. A loader program is needed to load the stored translated program into main memory before it can be executed. The stored translated program can be executed many times without being retranslated. The program will only need to be retranslated when changes are made to the source code.

With an interpreter, no translated program is generated in main memory or stored for later use. Every line in a program is interpreted then executed each time the program is run. High-level language programs are machine independent, portable and can be run on any type of computer/chip, provided there is a compiler or interpreter available. For example, Java, Python and Visual Basic® (VB) are high-level languages often used for teaching programming.

EXTENSION ACTIVITY 5D
Find out about three more high-level programming languages that are being used today.

The similarities and differences between assemblers, compilers and interpreters are shown in Table 5.3.

 | Assembler | Compiler | Interpreter
Source program written in | assembly language | high-level language | high-level language
Machine dependent | yes | no | no
Object program generated | yes, stored on disk or in main memory | yes, stored on disk or in main memory | no, instructions are executed under the control of the interpreter
Each line of the source program generates | one machine code instruction, one to one translation | many machine code instructions, instruction explosion | many machine code instructions, instruction explosion

▲ Table 5.3 Similarities and differences between assemblers, compilers and interpreters
5.2.2 Pros and cons of compiling or interpreting a program
Both compilers and interpreters are used for programs written in high-level languages. Some integrated development environments (IDEs) have both available for programmers, since interpreters are most useful in the early stages of development and compilers produce a stand-alone program that can be executed many times without needing the compiler. Table 5.4 shows the pros (in the blue cells) and cons (in the white cells) of compilers and interpreters.

Compiler: The end user only needs the executable code, therefore, the end user benefits as there is no need to purchase a compiler to translate the program before it is used.
Interpreter: The end user will need to purchase a compiler or an interpreter to translate the source code before it is used.

Compiler: The developer keeps hold of the source code, so it cannot be altered or extended by the end user, therefore, the developer benefits as they can charge for upgrades and alterations.
Interpreter: The developer relinquishes control of the source code, making it more difficult to charge for upgrades and alterations. Since end users can view the source code, they could potentially use the developer’s intellectual property.

Compiler: Compiled programs take a shorter time to execute as translation has already been completed and the machine code generated may have been optimised by the compiler.
Interpreter: An interpreted program can take longer to execute than the same program when compiled, since each line of the source code needs to be translated before it is executed every time the program is run.

Compiler: Compiled programs have no syntax or semantic errors.
Interpreter: Interpreted programs may still contain syntax or semantic errors if any part of the program has not been fully tested; these errors will need to be debugged.

Compiler: The source program can be translated on one type of computer then executed on another type of computer.
Interpreter: Interpreted programs cannot be interpreted on one type of computer and run on another type of computer.

Compiler: A compiler finds all errors in a program. One error detected can mean that the compiler finds other dependent errors later on in the program that will not be errors when the first error is corrected. Therefore, the number of errors found may be more than the actual number of errors.
Interpreter: It is easier to develop and debug a program using an interpreter as errors can be corrected on each line and the program restarted from that place, enabling the programmer to easily learn from any errors.

Compiler: Untested programs with errors may cause the computer to crash.
Interpreter: Untested programs should not be able to cause the computer to crash.

Compiler: The developer needs to write special routines in order to view partial results during development, making it more difficult to assess the quality of particular sections of code.
Interpreter: Partial results can be viewed during development, enabling the developer to make informed decisions about a section of code, for example whether to continue, modify, or scrap and start again.

Compiler: End users do not have access to the source code and the run-time libraries, meaning they are unable to make modifications and are reliant on the developer for updates and alterations.
Interpreter: If an interpreted program is purchased, end users have all the source code and the run-time libraries, enabling the program to be modified as required without further purchase.

▲ Table 5.4 Pros (blue cells) and cons (white cells) of compilers and interpreters.
5.2.3 Partial compiling and interpreting
In order to achieve shorter execution times, many high-level language programs use a system that is partly compilation and partly interpretation. The source code is checked and translated by a compiler into object code. The compiled object code is a low-level, machine-independent code, called intermediate code, p-code or bytecode. To execute the program, the object code can be interpreted by an interpreter or compiled using a compiler.
For example, Java and Python programs can be translated by a compiler into a set of instructions for a virtual machine. These instructions, called bytecode, are then interpreted by an interpreter. Below are examples of Java and Python intermediate code (bytecode):

Source code:
    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello World");
        }
    }
Bytecode:
    Compiled from "HelloWorld.java"
    public class HelloWorld extends java.lang.Object{
    public HelloWorld();
      Code:
       0: aload_0
       1: invokespecial #1; //Method java/lang/Object."<init>":()V
       4: return
    public static void main(java.lang.String[]);
      Code:
       0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc #3; //String Hello World
       5: invokevirtual #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
    }

Source code:
    print ("Hello World")
Bytecode:
    1   0 LOAD_NAME       0 (print)
        2 LOAD_CONST      0 ('Hello World')
        4 CALL_FUNCTION   1 (1 positional, 0 keyword pair)
        6 RETURN_VALUE
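The Python bytecode listing can be reproduced with the standard library `dis` module. The following is a minimal sketch; the variable names are illustrative, and the exact instruction names and offsets printed depend on the version of Python being used, so the output may not match the listing above line for line.

```python
import dis

source = 'print("Hello World")'

# Compile the source text into a code object without running it.
code_object = compile(source, "<example>", "exec")

# Collect the name of every bytecode instruction in the code object.
instruction_names = [
    instruction.opname for instruction in dis.get_instructions(code_object)
]

# Print the full bytecode listing for the virtual machine.
dis.dis(code_object)
```

The code object holds the intermediate code that the Python interpreter then executes; `dis` only displays it.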
EXTENSION ACTIVITY 5E Visual Basic also has an interpreter for bytecode. Find an example of bytecode for Visual Basic. See if you can find the bytecode for displaying ‘Hello World’ on the screen as in the Python example above.
5.2.4 Integrated development environment (IDE) An integrated development environment (IDE) is used by programmers to aid the writing and development of programs. There are many different IDEs available; some just support one programming language, others can be used for several different programming languages. NetBeans®, PyCharm®, Visual Studio® and SharpDevelop are all IDEs currently in use.
EXTENSION ACTIVITY 5F In small groups investigate different IDEs. See how many different features are available for your group’s IDE and identify which programming language(s) are supported. Compare the features of the IDE investigated by your group with the IDEs investigated by other groups in the class.
IDEs usually have
» a source code editor
» a compiler, an interpreter, or both
» a run-time environment with a debugger
» an auto-documenter.
Source code editor
A source code editor allows a program to be written and edited without the need to use a separate text editor. The use of an integrated source code editor speeds up the development process, as editing can be done without changing to a different piece of software each time the program needs correcting or adding to. Most source code editors colour code the words in the program and lay out the program in a meaningful way (prettyprinting). Some source code editors also offer context-sensitive prompts with text completion for variable names and reserved words, and provide dynamic syntax checking. Figures 5.12 and 5.13 show these features in the PyCharm source code editor.
colour coded words
context sensitive prompt offering text completion
▲ Figure 5.12 PyCharm IDE showing source code editor
Here, string values are shown coloured green and integer values are shown coloured blue.
▲ Figure 5.13 PyCharm IDE showing dynamic syntax checking
Dynamic syntax checking finds possible syntax errors as the program code is being typed into the source code editor and alerts the programmer at the time, before the source code is interpreted. Many errors can therefore be found and corrected during program writing and editing before the program is run. Logic errors can only be found when the program is run.
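The idea of checking syntax before a program is run can be sketched in Python with the built-in `compile` function, which translates source text without executing it. This is only an illustration of the principle, not how PyCharm itself is implemented; the function name `find_syntax_error` is hypothetical.

```python
def find_syntax_error(source):
    """Return a short description of the first syntax error, or None if there is none."""
    try:
        # Translate only; nothing in the source text is executed.
        compile(source, "<editor>", "exec")
    except SyntaxError as error:
        return f"line {error.lineno}: {error.msg}"
    return None


# A missing bracket is a syntax error and is reported before the program runs.
print(find_syntax_error("total = (2 + 3\n"))

# Dividing by the wrong number is a logic error, so no syntax error is reported.
print(find_syntax_error("average = (2 + 4) / 3\n"))
```

The second call returns `None` even though the average of two numbers has been divided by 3, which shows why logic errors are only found when the program is run and tested.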
For larger programs that have more than one code block, some code blocks can be collapsed to a single line in the editor allowing the programmer to just see the code blocks that are currently being developed.
Compilers and interpreters
Most IDEs provide a compiler and/or an interpreter to run the program. The interpreter is often used for developing the program and the compiler to produce the final version of the object code.
source program
run-time environment
▲ Figure 5.14 PyCharm IDE showing both program code and program run
With PyCharm there can be more than one interpreter available for different versions of the Python language. The program results are shown using the run-time environment provided.
A run-time environment with a debugger A debugger is a program that runs the program under development and aids the process of debugging. It allows the programmer to single step through the program a line at a time (single stepping) or to set a breakpoint to stop the execution of the program at a certain point in the source code. A report window then shows the contents of the variables and expressions evaluated at that point in the program. This allows the programmer to see if there are any logic errors in the program and check that the program works as intended.
single step
type and contents of the variables
▲ Figure 5.15 PyCharm IDE showing the report window after line 2 and after line 4
Each variable used is shown in the report window together with the type and the contents of the variable at that point in the program. The top variable shown is the last one that was used. Answers to calculations and other expressions can also be shown.
▲ Figure 5.16 PyCharm IDE showing the report window with the answer to an expression
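What a debugger records while single stepping can be sketched with the standard library function `sys.settrace`. This is a simplified illustration under stated assumptions, not the mechanism PyCharm necessarily uses; `run_with_single_step` and `total_price` are hypothetical names chosen for the example.

```python
import sys


def run_with_single_step(func, *args):
    """Run func and record its local variables just before each line executes."""
    snapshots = []

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # ignore every other function
        if event == "line":
            # Line number counted from the 'def' line of the traced function.
            relative_line = frame.f_lineno - func.__code__.co_firstlineno
            snapshots.append((relative_line, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always switch tracing off again
    return result, snapshots


def total_price(price, quantity):
    subtotal = price * quantity
    tax = subtotal * 20 // 100
    return subtotal + tax


if __name__ == "__main__":
    result, snapshots = run_with_single_step(total_price, 10, 3)
    for line, variables in snapshots:
        for name, value in variables.items():
            # Show the type and contents of each variable, like a report window.
            print(line, name, type(value).__name__, value)
```

Each snapshot plays the role of the report window at one step: it lists the variables that exist at that point together with their contents.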
Auto-documenter
Most IDEs provide an auto-documenter to explain the function and purpose of programming code.
▲ Figure 5.17 PyCharm IDE showing the quick documentation window for print
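In Python, documentation of this kind is commonly drawn from the documentation string (docstring) written at the top of a function. The sketch below uses the standard library `inspect` module to retrieve it; the function `area_of_rectangle` is a hypothetical example, and this is only an illustration of where such documentation can come from, not a description of PyCharm’s internals.

```python
import inspect


def area_of_rectangle(width, height):
    """Return the area of a rectangle.

    width and height are lengths in the same unit.
    """
    return width * height


# Retrieve the text an auto-documenter could display for this function.
documentation = inspect.getdoc(area_of_rectangle)
signature = str(inspect.signature(area_of_rectangle))

print("area_of_rectangle" + signature)
print(documentation)
```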
ACTIVITY 5B
1 a) i) Describe the difference between a compiler and an assembler.
     ii) Describe the difference between a compiler and an interpreter.
  b) State two benefits and two drawbacks of using an interpreter.
2 A new program is to be written in a high-level language. The programmer has decided to use an IDE to develop the new program.
  a) Explain what is meant by an IDE.
  b) Describe three features of an IDE.
End of chapter questions
1 A programmer is writing a program that includes code from a program library.
  a) Describe two benefits to the programmer of using one or more library routines. [4]
  b) The programmer decides to use a Dynamic Link Library (DLL) file.
     i) Describe two benefits of using DLL files. [4]
     ii) State one drawback of using DLL files. [2]
Cambridge International AS & A Level Computer Science 9608 Paper 12 Q8 November 2016
2 a) The operating system contains code for performing various management tasks. The appropriate code is run when the user performs actions. Copy the diagram below and connect each OS management task to the appropriate user action. [3]
     OS management task:
     » main memory management
     » input/output management
     » secondary storage management
     » human–computer interface management
     User action:
     » The user moves the mouse on the desktop
     » The user closes the spreadsheet program
     » The user selects the SAVE command to save their spreadsheet
     » The user selects the PRINT command to output their spreadsheet
  b) A user has the following issues with the use of his PC. State the utility software which should provide a solution.
     i) The hard disk stores a large number of video files. The computer frequently runs out of storage space. [1]
     ii) The user is unable to find an important document. He thinks it was deleted in error some weeks ago. This must not happen again. [1]
     iii) The operating system reports ‘bad sector’ errors. [1]
     iv) There have been some unexplained images and advertisements appearing on the screen. The user suspects it is malware. [1]
Cambridge International AS & A Level Computer Science 9608 Paper 11 Q6 June 2017
3 File History and Time Machine are examples of back-up utilities offered as part of two different operating systems.
  a) Explain why it is important to back up files on a computer. [2]
  b) One of the features offered by both utilities is the possibility of ‘turning back the internal computer clock’. Explain why this is an important feature and give two occasions when a user may wish to use this feature. [4]
  c) By using diagrams and written explanation, describe how defragmentation software works. [4]
4 Assemblers, compilers and interpreters are all used to translate programs. Discuss the different roles played by each translator. [6]
5 State four features of an IDE that are helpful when coding a program. [4]
6 Security, privacy and data integrity
In this chapter, you will learn about
★ the terms security, privacy and integrity of data
★ the need for security of data and security of computer systems
★ security measures to protect computer systems such as user accounts, passwords, digital signatures, firewalls, antivirus and anti-spyware software and encryption
★ security threats such as viruses and spyware, hacking, phishing and pharming
★ methods used to reduce security risks such as encryption and access rights
★ the use of validation to protect data integrity
★ the use of verification during data entry and data transfer to reduce or eliminate errors.
6.1 Data security
WHAT YOU SHOULD ALREADY KNOW
Try these five questions before you read the first part of this chapter.
1 a) What is meant by hacking?
  b) Is hacking always an illegal act? Justify your answer.
2 Contactless credit cards and debit cards are regarded by some as a security risk. Discuss the advantages and disadvantages of using contactless cards with particular reference to data security.
3 What are the main differences between cracking and hacking?
4 a) What are pop-ups when visiting a website? Are they a security risk?
  b) What are cookies? Do cookies pose a security threat?
  c) Describe:
     i) session cookies
     ii) permanent cookies
     iii) third party cookies.
5 Why must the correct procedures be carried out when removing a memory stick from a computer?
Key terms
Data privacy – the privacy of personal information, or other information stored on a computer, that should not be accessed by unauthorised parties.
Data protection laws – laws which govern how data should be kept private and secure.
Data security – methods taken to prevent unauthorised access to data and to recover data if lost or corrupted.
User account – an agreement that allows an individual to use a computer or network server, often requiring a user name and password.
Authentication – a way of proving somebody or something is who or what they claim to be.
Access rights (data security) – use of access levels to ensure only authorised users can gain access to certain data.
Malware – malicious software that seeks to damage or gain unauthorised access to a computer system.
Firewall – software or hardware that sits between a computer and external network that monitors and filters all incoming and outgoing activities.
Anti-spyware software – software that detects and removes spyware programs installed illegally on a user’s computer system.
Encryption – the use of encryption keys to make data meaningless without the correct decryption key.
Biometrics – use of unique human characteristics to identify a user (such as fingerprints or face recognition).
Hacking – illegal access to a computer system without the owner’s permission.
Malicious hacking – hacking done with the sole intent of causing harm to a computer system or user (for example, deletion of files or use of private data to the hacker’s advantage).
Ethical hacking – hacking used to test the security and vulnerability of a computer system. The hacking is carried out with the permission of the computer system owner, for example, to help a company identify risks associated with malicious hacking of their computer systems.
Phishing – legitimate-looking emails designed to trick a recipient into giving their personal data to the sender of the email.
Pharming – redirecting a user to a fake website in order to illegally obtain personal data about the user.
DNS cache poisoning – altering IP addresses on a DNS server by a ‘pharmer’ or hacker with the intention of redirecting a user to their fake website.
6.1.1 Data privacy
Data stored about a person or an organisation must remain private and unauthorised access to the data must be prevented – data privacy is required. This is achieved partly by data protection laws. These laws vary from country to country, but all follow the same eight guiding principles.
1 Data must be fairly and lawfully processed.
2 Data can only be processed for the stated purpose.
3 Data must be adequate, relevant and not excessive.
4 Data must be accurate.
5 Data must not be kept longer than necessary.
6 Data must be processed in accordance with the data subject’s rights.
7 Data must be kept secure.
8 Data must not be transferred to another country unless that country also has adequate protection.
Data protection laws usually cover organisations rather than private individuals. Such laws are no guarantee of privacy, but the legal threat of fines or jail sentences deters most people.
6.1.2 Preventing data loss and restricting data access
Data security refers to the methods used to prevent unauthorised access to data, as well as the methods used to recover data if it is lost or corrupted.
User accounts User accounts are used to authenticate a user (prove that a user is who they say they are). User accounts are used on both standalone and networked computers in case the computer can be accessed by a number of people. This is often done by a screen prompt asking for a username and password:
▲ Figure 6.1 A login screen (fields for username and password, a ‘keep me logged in’ option, and Sign In, Sign Up and forgotten-password links)
EXTENSION ACTIVITY 6A An airport uses a computer system to control security, flight bookings, passenger lists, administration and customer services. Describe how it is possible to ensure the safety of the data on the system so that senior staff can see all data, while customers can only access flight times (arrivals and departures) and duty free offers.
User accounts control access rights. This often involves levels of access. For example, in a hospital it would not be appropriate for a cleaner to have access to data about one of the patients. However, a consultant would need such access. Therefore, most systems have a hierarchy of access levels depending on a person’s level of security. This could be achieved by username and password with each username (account) linked to the appropriate level of access.
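A hierarchy of access levels linked to user accounts can be sketched as follows. This is a minimal illustration only: the usernames, roles, level numbers and resource names are all hypothetical, and a real system would also authenticate the user (for example, with a password) before checking access rights.

```python
# Hypothetical access levels: a higher number means more access.
ACCESS_LEVELS = {"cleaner": 1, "nurse": 2, "consultant": 3}

# Hypothetical minimum level needed to view each type of data.
REQUIRED_LEVEL = {"ward rota": 1, "patient records": 3}

# Each username (account) is linked to a role, and so to an access level.
ACCOUNTS = {"asmith": "cleaner", "bjones": "consultant"}


def can_access(username, resource):
    """Return True only if the account's level is high enough for the resource."""
    role = ACCOUNTS.get(username)
    required = REQUIRED_LEVEL.get(resource)
    if role is None or required is None:
        return False  # unknown account or unknown resource: refuse access
    return ACCESS_LEVELS[role] >= required


print(can_access("asmith", "patient records"))
print(can_access("bjones", "patient records"))
```

Refusing access by default when the account or resource is not recognised is a deliberate choice in this sketch: it is safer to deny than to allow when the system is unsure.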
Use of passwords
Passwords are used to restrict access to data or systems. They should be hard to crack and changed frequently to retain security. Passwords can also take the form of biometrics (such as on a mobile phone, as discussed later). Passwords are also used, for example, when
» accessing email accounts
» carrying out online banking or shopping
» accessing social networking sites.
It is important that passwords are protected. Some ways of doing this are to
» run anti-spyware software to make sure your passwords are not being relayed to whoever put the spyware on your computer
» regularly change passwords in case they have been seen by someone else, illegally or accidentally
» make sure passwords are difficult to crack or guess (for example, do not use your date of birth or pet’s name).
Passwords are grouped as either strong (hard to crack or guess) or weak (relatively easy to crack or guess). Strong passwords should contain
» at least one capital letter
» at least one numerical value
» at least one other keyboard character (such as @, *, &).
Example of a strong password: Sy12@#TT90kj=0
Example of a weak password: GREEN
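The three rules for a strong password can be expressed as a short check. This is a sketch that tests only the three rules listed above; the function name is hypothetical, and real systems usually apply further rules (such as a minimum length) that are not considered here.

```python
import string


def is_strong_password(password):
    """Check the three rules: a capital letter, a digit and another keyboard character."""
    has_capital = any(character.isupper() for character in password)
    has_digit = any(character.isdigit() for character in password)
    has_symbol = any(character in string.punctuation for character in password)
    return has_capital and has_digit and has_symbol


print(is_strong_password("Sy12@#TT90kj=0"))
print(is_strong_password("GREEN"))
```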
EXTENSION ACTIVITY 6B
Which of the following are weak passwords and which are strong passwords? Explain your decision in each case.
a) 25-May-2000
b) Pas5word
c) ChapTer@06
d) AbC*N55!
e) 12345X
Digital signatures
Digital signatures protect data by providing a way of identifying the sender of, for example, an email. These are covered in more depth in Chapter 17.
Use of firewalls
A firewall can be software or hardware. It sits between the user’s computer and an external network (such as the internet) and filters information in and out of the computer. This allows the user to decide whether to allow communication with an external source, and warns the user when an external source is trying to access their computer. Firewalls are the primary defence of any computer system against hacking, malware (viruses and spyware), phishing and pharming.
user’s computer ↔ firewall (software or hardware) ↔ internet
▲ Figure 6.2 Firewall
The tasks carried out by a firewall include
» examining the traffic between the user’s computer (or internal network) and a public network (such as the internet)
» checking whether incoming or outgoing data meets a given set of criteria
» blocking the traffic if the data fails to meet the criteria, and giving the user (or network manager) a warning that there may be a security issue
» logging all incoming and outgoing traffic to allow later interrogation by the user (or network manager)
» preventing access to certain undesirable sites – the firewall can keep a list of all undesirable IP addresses
» helping to prevent viruses or hackers entering the user’s computer (or internal network)
» warning the user if some software on their system is trying to access an external data source (such as an automatic software upgrade). The user is given the option of allowing it to go ahead or requesting that such access is denied.
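Three of these tasks (checking traffic against criteria, blocking traffic that fails, and logging everything) can be sketched as below. The criteria used here, a list of blocked IP addresses and a set of allowed port numbers, are assumptions made for the example, as are all the names and addresses; real firewalls apply much richer rules.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source_ip: str
    destination_ip: str
    destination_port: int


# Hypothetical criteria: undesirable addresses and the only ports allowed.
BLOCKED_ADDRESSES = {"203.0.113.7"}
ALLOWED_PORTS = {80, 443}

traffic_log = []  # every decision is logged for later interrogation


def filter_packet(packet):
    """Return 'allow' or 'block' for a packet and record the decision."""
    if (packet.source_ip in BLOCKED_ADDRESSES
            or packet.destination_ip in BLOCKED_ADDRESSES):
        decision = "block"
    elif packet.destination_port not in ALLOWED_PORTS:
        decision = "block"
    else:
        decision = "allow"
    traffic_log.append((packet, decision))
    return decision


print(filter_packet(Packet("192.0.2.15", "198.51.100.20", 443)))
print(filter_packet(Packet("203.0.113.7", "192.0.2.15", 443)))
```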
The firewall can be a hardware interface which is located somewhere between the computer (or internal network external link) and the internet connection. In these cases, it is often referred to as a gateway. Alternatively, the firewall can be software installed on a computer, sometimes as part of the operating system. However, sometimes the firewall cannot prevent potentially harmful traffic. It cannot
» prevent individuals, on internal networks, using their own modems to by-pass the firewall
» control employee misconduct or carelessness (for example, control of passwords or user accounts)
» prevent users on stand-alone computers from disabling the firewall.
These issues require management and/or personal control to ensure the firewall can work effectively.
Antivirus software
Running antivirus software in the background on a computer will constantly check for virus attacks. Although different types of antivirus software work in different ways, they all
» check software or files before they are run or loaded on a computer
» compare possible viruses against a database of known viruses
» carry out heuristic checking (check software for behaviour that could indicate a virus, which is useful if software is infected by a virus not yet on the database)
» quarantine files or programs which are possibly infected and
  – allow the virus to be automatically deleted, or
  – allow the user to make the decision about deletion (it is possible that the user knows that the file or program is not infected by a virus – this is known as a false positive and is one of the drawbacks of antivirus software).
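Comparing a file against a database of known viruses can be sketched using a hash of the file contents as its ‘fingerprint’. This is an assumption made for illustration: the text does not say how the comparison is carried out, and real antivirus products use more sophisticated signatures as well as the heuristic checks described above. The ‘virus’ here is just a harmless string of bytes.

```python
import hashlib

# Hypothetical database of fingerprints (SHA-256 hashes) of known viruses.
KNOWN_VIRUS_HASHES = {hashlib.sha256(b"pretend virus code").hexdigest()}


def is_known_virus(file_bytes):
    """Return True if the file's fingerprint matches one in the database."""
    fingerprint = hashlib.sha256(file_bytes).hexdigest()
    return fingerprint in KNOWN_VIRUS_HASHES


print(is_known_virus(b"pretend virus code"))
print(is_known_virus(b"an ordinary document"))
```

A check like this can only recognise viruses already in the database, which is one reason why the database needs to be kept up to date.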
Antivirus software needs to be kept up to date since new viruses are constantly being discovered. Full system checks need to be carried out regularly (once a week, for example), since some viruses lie dormant and would only be picked up by this full system scan.
Anti-spyware software
Anti-spyware software detects and removes spyware programs installed illegally on a user’s computer system. The software is either based on rules (it looks for typical features associated with spyware) or based on known file structures which can identify common spyware programs.
Encryption
If data on a computer is encrypted, it is virtually impossible to understand without the encryption keys needed to decode it, even if it has been accessed illegally (by a hacker, for example). Encryption cannot stop a hacker from deleting the files, but it will stop them using the data for themselves. This is covered in more depth in Chapter 17.
Biometrics
In an attempt to stay one step ahead of hackers and malware writers, many modern computer devices use biometrics as part of the password system. Biometrics rely on the unique characteristics of human beings. Examples include fingerprint scans, retina scans (pattern of blood capillary structure), face recognition and voice recognition.
Fingerprint scans
Images of fingerprints are compared against previously scanned fingerprints stored in a database; if they match then access is allowed. The system compares patterns of ‘ridges’ and ‘valleys’ which are fairly unique (accuracy is about 1 in 500).
▲ Figure 6.3 Fingerprint
Retina scans
Retina scans use infra-red to scan the unique pattern of blood vessels in the retina (at the back of the eye). It requires a person to stay still for 10 to 15 seconds while the scan takes place; it is very secure since nobody has yet found a way to duplicate the blood vessel patterns (accuracy is about 1 in 10 million).
▲ Figure 6.4 Retina scan
Mobile phones use biometrics to identify if the phone user is the owner.
6.1.3 Risks to the security of stored data Hacking You will see the term hacking used throughout this textbook. There are two types of hacking: malicious and ethical. Malicious hacking is the illegal access to a computer system without the user’s permission or knowledge. It is usually employed with the intention of deleting, altering or corrupting files, or to gain personal details such as bank account details. Strong passwords, firewalls and software which can detect illegal activity all guard against hacking. Ethical hacking is authorised by companies to check their security measures and how robust their computer systems are to resist hacking attacks. It is legal, and is done with a company’s permission with a fee paid to the ethical hacker.
Malware
Malware is one of the biggest risks to the integrity and security of data on a computer system. Many software applications sold as antivirus are capable of identifying and removing most of the forms of malware described below.
Viruses
Programs or program code that can replicate and/or copy themselves with the intention of deleting or corrupting files or causing the computer to malfunction. They need an active host program on the target computer or an operating system that has already been infected before they can run.
Worms
Stand-alone viruses that can replicate themselves with the intention of spreading to other computers; they often use networks to search out computers with weak security.
Trojan horses
Malicious programs often disguised as legitimate software. They replace all or part of the legitimate software with the intent of carrying out some harm to the user’s computer system.
Logic bombs
Code embedded in a program on a computer. When certain conditions are met (such as a specific date) they are activated to carry out tasks such as deleting files or sending data to a hacker.
Bots (internet robots)
Not always harmful and can be used, for example, to search automatically for an item on the internet. However, they can cause harm by taking control over a computer system and launching attacks.
Spyware
Software that gathers information by monitoring, for example, key presses on the user’s keyboard. The information is then sent back to the person who sent the software – sometimes referred to as key logging software.
Phishing
Phishing is when someone sends legitimate-looking emails to users. They may contain links or attachments which, when clicked, take the user to a fake website, or they may trick the user into responding with personal data such as bank account details or credit card numbers. The email often appears to come from a trusted source such as a bank or service provider. The key is that the recipient has to carry out a task (click a link, for example) before the phishing scam causes harm.
There are numerous ways to help prevent phishing attacks:
» Users need to be aware of new phishing scams. Those people in industry or commerce should undergo frequent security awareness training to become aware of how to identify phishing (and pharming) scams.
» Do not click on links unless certain that it is safe to do so; fake emails can often be identified by greetings such as ‘Dear Customer’ or ‘Dear [emailprotected]’, and so on.
» It is important to run anti-phishing toolbars on web browsers (this includes tablets and mobile phones) since these will alert the user to malicious websites contained in an email.
» Look out for https and/or the green padlock symbol in the address bar (both suggest that traffic to and from the website is encrypted).
» Regularly check online accounts and frequently change passwords.
» Ensure an up-to-date browser, with all of the latest security upgrades, is running, and run a good firewall in the background at all times. A combination of a desktop firewall (usually software) and a network firewall (usually hardware) considerably reduces risk.
» Be wary of pop-ups – use the web browser to block them; if pop-ups get through your defences, do not click on ‘cancel’ since this often leads to phishing or pharming sites – the best option is to select the small X in the top right hand corner of the pop-up window, which closes it down.
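Two of the checks a cautious user makes on a link (does it use https, and is the website name spelt exactly as expected?) can be sketched with the standard library function `urlsplit`. The list of trusted hosts and the function name are hypothetical, and passing this check does not prove that a link is safe; it only catches two common warning signs.

```python
from urllib.parse import urlsplit

# Hypothetical list of website names the user already trusts.
TRUSTED_HOSTS = {"www.example.com"}


def link_looks_trustworthy(url):
    """Return True only for https links whose host name exactly matches a trusted one."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in TRUSTED_HOSTS


print(link_looks_trustworthy("https://www.example.com/login"))
print(link_looks_trustworthy("http://www.example.com/login"))   # not https
print(link_looks_trustworthy("https://www.examp1e.com/login"))  # misspelt name
```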
Pharming
Pharming is malicious code installed on a user’s computer or on a web server. The code re-directs the user to a fake website without their knowledge (the user does not have to take any action, unlike phishing). The creator of the malicious code can gain personal data such as bank details from users. Often, the website appears to belong to a trusted company and can lead to fraud or identity theft.
Why does pharming pose a threat to data security?
Pharming redirects users to a fake or malicious website set up by, for example, a hacker. Redirection from a legitimate website can be done using DNS cache poisoning. Every time a user types in a URL, their web browser contacts the DNS server. The IP address of the website is then sent back to their web browser. However, DNS cache poisoning changes the real IP address values to those of the fake website; consequently, the user’s computer connects to the fake website.
Pharmers can also send malicious programming code to a user’s computer. The code is stored on the HDD without their knowledge. Whenever the user types in the website address of the targeted website, the malicious programming code alters the IP address sent back to their browser, which redirects it to the fake website.
Protection against pharming
It is possible to mitigate the risk of pharming by
» using antivirus software, which can detect unauthorised alterations to a website address and warn the user
» using modern web browsers that alert users to pharming and phishing attacks
» checking the spelling of websites
» checking for https and/or the green padlock symbol in the address bar.
It is more difficult to mitigate risk if the DNS server itself has been infected (rather than the user’s computer).
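The effect of DNS cache poisoning can be modelled with a simple lookup table. This is a simulation only: the dictionary stands in for the cache held by a DNS server, the host name and both IP addresses are made-up values from ranges reserved for examples, and no real DNS requests are made.

```python
REAL_ADDRESS = "192.0.2.10"    # hypothetical address of the genuine website
FAKE_ADDRESS = "203.0.113.66"  # hypothetical address of the pharmer's website

# The cache maps website names to the IP addresses sent back to browsers.
dns_cache = {"www.bank.example": REAL_ADDRESS}


def resolve(hostname):
    """Return the IP address the cache currently holds for a website name."""
    return dns_cache.get(hostname)


address_before = resolve("www.bank.example")

# The pharmer alters the stored IP address (the 'poisoning').
dns_cache["www.bank.example"] = FAKE_ADDRESS

address_after = resolve("www.bank.example")

print(address_before, address_after)
```

The user types exactly the same website name both times, yet the second lookup sends their browser to the fake address, which is why no action by the user is needed for pharming to work.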
EXTENSION ACTIVITY 6C Pharmers alter IP addresses in order to send users to fake websites. However, the internet does not only have one DNS server. Find out how a user’s internet service provider (ISP) uses its own DNS servers which cache information from other internet DNS servers.
6.1.4 Data recovery
This section covers the potential impact on data caused by accidental mal-operation, hardware malfunction and software malfunction. In each case, the method of data recovery and safeguards to minimise the risk are considered.

accidental loss of data (for example, accidental deletion of a file)
» use back-ups in case the data is lost or corrupted through an accidental operation
» save data on a regular basis
» use passwords and user IDs to restrict access to authorised users only

hardware fault (such as head crash on the HDD)
» use back-ups in case data is lost or corrupted through the hardware fault
» use an uninterruptible power supply (UPS) in case power loss causes hardware malfunction
» save data on a regular basis
» use parallel systems as back-up hardware

software fault (for example, incompatible software installed on the system)
» use back-ups in case the data is lost or corrupted through the software fault
» save data on a regular basis in case the software suddenly ‘freezes’ or ‘crashes’ while the user is working on it

incorrect computer operation (for example, incorrect shutdown or procedure for removing memory stick)
» use back-ups in case data is lost or corrupted through wrong operation
» correct training procedures so users are aware of the correct operation of hardware

▲ Figure 6.5 Safeguards
In all cases, the backing up of data regularly (automatically and/or manually at the end of the day) onto another medium (such as cloud storage, or removable HDD) is key to data recovery. The back-up should be stored in a separate location in case of, for example, a fire or an office break-in. Somebody should be given the role of carrying out back-ups, to ensure it is always done. Backing up data may not be a suitable method of recovery in the case of a virus infection, as the backed up data may contain strands of the virus which could re-infect the ‘cleaned’ computer.
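A simple manual back-up of one file can be sketched as follows. The function name and the way the copy is named (the original name plus a date and time) are assumptions made for this example; in practice the back-up folder would be on another medium in a separate location, and dedicated back-up utilities would normally be used.

```python
import shutil
from datetime import datetime
from pathlib import Path


def back_up_file(source, backup_folder):
    """Copy a file into the back-up folder, adding the date and time to its name."""
    source = Path(source)
    backup_folder = Path(backup_folder)
    backup_folder.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = backup_folder / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, destination)  # copies the contents and the file's timestamps
    return destination
```

Adding the date and time to the name means that earlier back-ups are not overwritten, so an older copy can still be restored if a later one turns out to be corrupted.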
ACTIVITY 6A
1 A company has offices in four different countries. Communication and data sharing between the offices is done via computers connecting over the internet. Describe three data security issues the company might encounter during their day to day communications and data sharing. For each issue described, explain why it could be a threat to the security of the company. For each issue described, describe a way to mitigate the threat which has been posed.
2 Define these three terms.
  a) Worm
  b) Logic bomb
  c) Trojan horse
3 John works for a car company. He maintains the database which contains all the personal data of the people working for the car company. John was born on 28 February 1990 and has two pet cats called Felix and Max.
  a) John needs to use a password and a username to log onto the database. Why would the following three passwords not be a good choice?
     i) 280290
     ii) FeLix1234
     iii) John04
  b) Describe how John could improve his passwords. How should he maintain his passwords to maximise database security?
  c) When John enters a password on his computer, he is presented with the following question on screen.
     Would you like to save the password on this device?
     Why is it important that John always says ‘no’ to this question?
  d) John frequently orders goods from an online company called NILE.com. He opens an email which purports to be from NILE.com.
     Dear NILE.com user
     This is to confirm your recent order for: 01230123 A level Computer Science Workbook, $15.90
     If this is not your order, please click on the following link and update your details: [emailprotected]
     Thank you. Customer services.
     Explain why John should be suspicious of the email. Include, in your explanation, the type of security threat identified by this email.
6.2 Data integrity
WHAT YOU SHOULD ALREADY KNOW
Try these three questions before you read the second part of this chapter.
1 Look at the following validation screen from a spreadsheet. Why is it important to have validation in applications such as spreadsheets?
2 Why is proofreading not the same as verification?
3 Discuss one way online form designers can ensure that only certain data can be input by a user. Use the date: 12 March 2019 as the example.
Key terms
Data integrity – the accuracy, completeness and consistency of data.
Validation – method used to ensure entered data is reasonable and meets certain input criteria.
Verification – method used to ensure data is correct by using double entry or visual checks.
Check digit – additional digit appended to a number to check if entered data is error free.
Modulo-11 – method used to calculate a check digit based on modulus division by 11.
Checksum – verification method used to check if data transferred has been altered or corrupted, calculated from the block of data to be sent.
Parity check – method used to check if data has been transferred correctly that uses even or odd parity.
Parity bit – an extra bit found at the end of a byte that is set to 1 if the parity of the byte needs to change to agree with sender/receiver parity protocol.
Odd parity – binary number with an odd number of 1-bits.
Even parity – binary number with an even number of 1-bits.
Parity block – horizontal and vertical parity check on a block of data being transferred.
Data stored on a computer should always be accurate, consistent and up to date. Two of the methods used to ensure data integrity are validation and verification. The accuracy (integrity) of data can be compromised
» during the data entry and data transmission stages
» by malicious attacks on the data, for example caused by malware and hacking
» by accidental data loss caused through hardware issues.
These risks – together with ways of mitigating them – are discussed in the rest of this chapter.
6.2.1 Validation

Key terms
Parity byte – additional byte sent with transmitted data to enable vertical parity checking (as well as horizontal parity checking) to be carried out.
Automatic repeat request (ARQ) – a type of verification check.
Acknowledgement – message sent to the sender to indicate that data has been received without error.
Timeout – time allowed to elapse before an acknowledgement is received.

Validation is a method of checking if entered data is reasonable (and within given criteria), but it cannot check if data is correct or accurate. For example, if somebody accidentally enters their age as 62 instead of 26, it is reasonable but not accurate or correct. Validation is carried out by computer software; the most common types are shown in Table 6.1.

| Validation test | Description | Example of data failing validation test | Example of data passing validation test |
| type | checks whether non-numeric data has been input into a numeric-only field | typing sk.34 in a field which should contain the price of an item | typing 34.50 in a field which should contain the price of an item |
| range | checks whether data entered is between a lower and an upper limit | typing in somebody’s age as −120 | typing in somebody’s age as 48 |
| format | checks whether data has been entered in the agreed format | typing in the date as 12-12-20 where the format is dd/mm/yyyy | typing in the date as 12/12/2020 where the format is dd/mm/yyyy |
| length | checks whether data has the required number of characters or numbers | typing in a telephone number as 012 345 678 when it should contain 11 digits | typing in a telephone number as 012 345 678 90 when it should contain 11 digits |
| presence | checks to make sure a field is not left empty when it should contain data | please enter passport number:……………… | please enter passport number: AB 1234567 CD |
| existence | checks if data in a file or a file name actually exists | data look up for car registration plate A123 BCD which does not exist | data look up for a file called books_in_stock which exists in a database |
| limit check | checks only one of the limits (such as the upper limit OR the lower limit) | typing in age as −25 where the data entered should not be negative | typing in somebody’s age as 72 where the upper limit is 140 |
| consistency check | checks whether data in two or more fields match up correctly | typing in Mr in the title field and then choosing female in the sex field | typing in Ms in the title field and then choosing female in the sex field |
| uniqueness check | checks that each entered value is unique | choosing the user name MAXIMUS222 in a social networking site but the user name already exists | choosing the website name Aristooo.com which is not already used |

▲ Table 6.1 Common validation
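Several of the checks in Table 6.1 can be sketched in a few lines of Python. This is only an illustration, not a complete validation library: the function names are hypothetical, the age limits of 0 and 140 are assumptions borrowed from the limit check row, and the sample values are the ones used in the table.

```python
import re


def type_check(value: str) -> bool:
    """Type check: the text must be readable as a number (for example a price)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def range_check(value: int, lower: int, upper: int) -> bool:
    """Range check: the value must lie between a lower and an upper limit."""
    return lower <= value <= upper


def format_check(value: str) -> bool:
    """Format check: the text must follow the pattern dd/mm/yyyy."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", value) is not None


def length_check(value: str, required_digits: int) -> bool:
    """Length check: count the digits, ignoring any spaces between them."""
    digits = [ch for ch in value if ch.isdigit()]
    return len(digits) == required_digits


def presence_check(value: str) -> bool:
    """Presence check: the field must not be left empty."""
    return value.strip() != ""


print(type_check("sk.34"), type_check("34.50"))
print(range_check(-120, 0, 140), range_check(48, 0, 140))
print(format_check("12-12-20"), format_check("12/12/2020"))
print(length_check("012 345 678", 11), length_check("012 345 678 90", 11))
print(presence_check(""), presence_check("AB 1234567 CD"))
```

Note that the format check only tests the pattern of the characters: a date such as 99/99/2020 would still pass, which is why validation can show that data is reasonable but not that it is correct.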
6.2.2 Verification
Verification is a way of preventing errors when data is entered manually (using a keyboard, for example) or when data is transferred from one computer to another.
Verification during data entry
When data is manually entered into a computer it needs to undergo verification to ensure there are no errors. There are three ways of doing this: double entry, visual check and check digits.

Double entry
Data is entered twice, using two different people, and then compared (either after data entry or during the data entry process).

Visual check
Entered data is compared with the original document (in other words, what is on the screen is compared to the data on the original paper documents).

Check digits
The check digit is an additional digit added to a number (usually in the rightmost position). They are often used in barcodes, ISBNs (found on the cover of a book) and VINs (vehicle identification number). The check digit can be used to ensure the barcode, for example, has been correctly inputted. The check digit can catch errors including
» an incorrect digit being entered (such as 8190 instead of 8180)
» a transposition error where two numbers have been swapped (such as 8108 instead of 8180)
» digits being omitted or added (such as 818 or 81180 instead of 8180)
» phonetic errors such as 13 (thirteen) instead of 30 (thirty).
Figure 6.6 shows a barcode with an ISBN-13 code with check digit. 9780340983829
▲ Figure 6.6 Barcode
An example of a check digit calculation is modulo-11. The following algorithm is used to generate the check digit for a number with seven digits:
1 Each digit in the number is given a weighting of 7, 6, 5, 4, 3, 2 or 1, starting from the left.
2 Each digit is multiplied by its weighting and then each value is added to make a total.
3 The total is divided by 11 and the remainder subtracted from 11.
4 The check digit is the value generated; note if the check digit is 10 then X is used.

For example:
Seven digit number: 4 1 5 6 7 1 0
Weighting values: 7 6 5 4 3 2 1
Sum: (7 × 4) + (6 × 1) + (5 × 5) + (4 × 6) + (3 × 7) + (2 × 1) + (1 × 0) = 28 + 6 + 25 + 24 + 21 + 2 + 0
Total: 106
Divide total by 11: 9 remainder 7
Subtract remainder from 11: 11 – 7 = 4 (check digit)
Final number: 4 1 5 6 7 1 0 4
When this number is entered, the check digit is recalculated and, if the same value is not generated, an error has occurred. For example, if 4 1 5 7 6 1 0 4 was entered, the check digit generated would be 3, indicating an error.
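The modulo-11 steps can be sketched in Python as follows. The function names are hypothetical, and one case is an assumption rather than something stated in the text: when the remainder is 0 the subtraction gives 11, which is treated here as a check digit of 0.

```python
WEIGHTS = [7, 6, 5, 4, 3, 2, 1]  # weighting for each digit, starting from the left


def modulo11_check_digit(digits: str) -> str:
    """Return the modulo-11 check digit for a seven-digit number."""
    if len(digits) != 7 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly seven digits")
    total = sum(weight * int(digit) for weight, digit in zip(WEIGHTS, digits))
    value = 11 - (total % 11)
    if value == 10:
        return "X"   # 10 is written as X
    if value == 11:
        return "0"   # assumption: a remainder of 0 gives a check digit of 0
    return str(value)


def is_valid(number: str) -> bool:
    """Recalculate the check digit of an eight-character number and compare."""
    if len(number) != 8 or not (number[:7].isascii() and number[:7].isdigit()):
        return False
    return modulo11_check_digit(number[:7]) == number[7]


print(modulo11_check_digit("4156710"))  # worked example from the text
print(is_valid("41567104"))             # number entered correctly
print(is_valid("41576104"))             # 6 and 7 transposed
```

For the transposed number the recalculated check digit is 3 rather than 4, so the comparison fails and the error is flagged.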
EXTENSION ACTIVITY 6D
1 Find out how the ISBN-13 method works and confirm that the number 978 034 098 382 has a check digit of 9.
2 Find the check digits for the following numbers using both modulo-11 and ISBN-13.
   a) 213 111 000 428
   b) 909 812 123 544
3 Find a common use for the modulo-11 method of generating check digits.
Verification during data transfer
When data is transferred electronically from one device to another, there is always the possibility of data corruption or even data loss. A number of ways exist to minimise this risk.

Checksums
A checksum is a method to check if data has been changed or corrupted following data transmission. Data is sent in blocks and an additional value, the checksum, is sent at the end of the block of data. To explain how this works, we will assume the checksum of a block of data is 1 byte in length. This gives a maximum value of 2⁸ − 1 (= 255). The value 0000 0000 is ignored in this calculation.

The following explains how a checksum is generated. If the sum of all the bytes in the transmitted block of data is ≤ 255, then the checksum is this value. However, if the sum of all the bytes in the data block is > 255, then the checksum is found using the following simple algorithm. In the example we will assume the value of X is 1185.

① Divide the sum, X, of the bytes by 256: (X = 1185) 1185/256 = 4.629
② Round the answer down to the nearest whole number, Y: Y = 4
③ Multiply by 256: Z = Y * 256 = 1024
④ Calculate the difference (X – Z): (1185 – 1024) = 161
⑤ The value is the checksum: 161

▲ Figure 6.7
When a block of data is about to be transmitted, the checksum for the bytes is first calculated. This value is transmitted with the block of data. At the receiving end, the checksum is re-calculated from the block of data received. This calculated value is compared to the checksum transmitted. If they are the same, then the data was transmitted without any errors; if they are different, then a request is sent for the data to be re-transmitted.
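The checksum algorithm and the comparison made at the receiving end can be sketched in Python. The function names and the sample block are illustrative assumptions; the block is simply chosen so that its bytes add up to 1185, the value of X used in the example.

```python
def checksum(block: bytes) -> int:
    """One-byte checksum following the five steps described in the text."""
    x = sum(block)          # X: the sum of all the bytes in the block
    if x <= 255:
        return x            # small sums are used as the checksum directly
    y = x // 256            # steps 1 and 2: divide by 256 and round down
    z = y * 256             # step 3
    return x - z            # steps 4 and 5: the difference is the checksum


def received_without_error(block: bytes, transmitted_checksum: int) -> bool:
    """Receiver's test: recalculate the checksum and compare the two values."""
    return checksum(block) == transmitted_checksum


block = bytes([255, 255, 255, 255, 165])    # illustrative block, sum = 1185
value = checksum(block)
print(value)

corrupted = bytes([255, 255, 254, 255, 165])  # one byte changed in transit
print(received_without_error(block, value))
print(received_without_error(corrupted, value))
```

The subtraction X – Z leaves the remainder after dividing by 256, so the same value could also be obtained with `sum(block) % 256`.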
Parity checks
A parity check is another method to check whether data has been changed or corrupted following transmission from one device or medium to another.

A byte of data, for example, is allocated a parity bit. This is allocated before transmission. Systems that use even parity have an even number of 1-bits; systems that use odd parity have an odd number of 1-bits. Consider the following byte:

[Figure 6.8: a byte containing four 1-bits, with the parity bit still to be set]
▲ Figure 6.8

If this byte is using even parity, then the parity bit needs to be 0 since there is already an even number of 1-bits (in this case, four).

If odd parity is being used, then the parity bit needs to be 1 to make the number of 1-bits odd. Therefore, the byte just before transmission would be:

[Figure 6.9: the same byte shown twice, with the parity bit set to 0 (even parity) and with the parity bit set to 1 (odd parity)]
▲ Figure 6.9
Before data is transferred, an agreement is made between sender and receiver regarding which of the two types of parity is used. This is an example of a protocol.
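Setting the parity bit before transmission can be sketched in Python. The function names are hypothetical, the seven-bit pattern 1101100 is only an illustrative value containing four 1-bits, and placing the parity bit in the leftmost position is an assumption.

```python
def parity_bit(data_bits: str, parity: str = "even") -> int:
    """Return the parity bit needed for seven data bits under the agreed protocol."""
    ones = data_bits.count("1")
    if parity == "even":
        return ones % 2          # 1 only if the count of 1-bits is currently odd
    if parity == "odd":
        return (ones + 1) % 2    # 1 only if the count of 1-bits is currently even
    raise ValueError("parity must be 'even' or 'odd'")


def add_parity(data_bits: str, parity: str = "even") -> str:
    """Build the byte to transmit, with the parity bit placed first (assumption)."""
    return str(parity_bit(data_bits, parity)) + data_bits


data = "1101100"                      # illustrative pattern with four 1-bits
print(add_parity(data, "even"))       # parity bit 0: the count is already even
print(add_parity(data, "odd"))        # parity bit 1: makes the count odd
```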
EXTENSION ACTIVITY 6E
Find the parity bits for each of the following bytes:
1 1 1 0 1 1 0 1 (even parity used)
2 0 0 0 1 1 1 1 (even parity used)
3 0 1 1 1 0 0 0 (even parity used)
4 1 1 1 0 1 0 0 (odd parity used)
5 1 0 1 1 0 1 1 (odd parity used)
If a byte has been transmitted from ‘A’ to ‘B’, and even parity is used, an error would be flagged if the byte now had an odd number of 1-bits at the receiver’s end. For example:

[Figure 6.10: the sender’s byte, containing four 1-bits, and the receiver’s byte, containing three 1-bits]
▲ Figure 6.10
In this case, the receiver’s byte has three 1-bits, which means it now has odd parity, while the byte from the sender had even parity (four 1-bits). This means an error has occurred during the transmission of the data. The error is detected by the computer re-calculating the parity of the byte sent. If even parity has been agreed between sender and receiver, then a change of parity in the received byte indicates that a transmission error has occurred.
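The receiver's test can be sketched in Python as a recount of the 1-bits in the byte that arrived. The function name and the two bit patterns are illustrative assumptions; the patterns are chosen only so that the sent byte has four 1-bits and the received byte has three.

```python
def has_parity_error(byte_bits: str, parity: str = "even") -> bool:
    """Return True if the received byte does not match the agreed parity."""
    if parity not in ("even", "odd"):
        raise ValueError("parity must be 'even' or 'odd'")
    expected_remainder = 0 if parity == "even" else 1
    return byte_bits.count("1") % 2 != expected_remainder


sent = "01101100"       # illustrative byte with four 1-bits (even parity)
received = "01100100"   # one bit has changed, leaving three 1-bits

print(has_parity_error(sent, "even"))
print(has_parity_error(received, "even"))
```

The function reports only that the parity has changed; it gives no information about which bit is wrong.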
EXTENSION ACTIVITY 6F
1 Which of the following bytes have an error following data transmission?
   a) 1 1 1 0 1 1 0 1 (even parity used)
   b) 0 1 0 0 1 1 1 1 (even parity used)
   c) 0 0 1 1 1 0 0 0 (even parity used)
   d) 1 1 1 1 0 1 0 0 (odd parity used)
   e) 1 1 0 1 1 0 1 1 (odd parity used)
2 In each case where an error occurs, can you work out which bit is incorrect?
Naturally, any of the bits in the above example could have been changed, leading to a transmission error. Therefore, even though an error has been flagged, it is impossible to know exactly which bit is in error. One of the ways around this problem is to use parity blocks. In this method, a block of data is sent and the number of 1-bits is totalled horizontally and vertically (in other words, a parity check is done in both horizontal and vertical directions). As the following example shows, this method not only identifies that an error has occurred but also indicates where the error is.

In this example, nine bytes of data have been transmitted. Agreement has been made that even parity will be used. Another byte, known as the parity byte, has also been sent. This byte consists entirely of the parity bits produced by the vertical parity check. The parity byte also indicates the end of the block of data. Table 6.2 shows how the data arrived at the receiving end:

[Table 6.2: the received block of data, with rows byte 1 to byte 9 followed by the parity byte, and columns parity bit, bit 2, bit 3, bit 4, bit 5, bit 6, bit 7 and bit 8]
▲ Table 6.2
A careful study of the table shows that
» byte 8 (row 8) has incorrect parity (there are three 1-bits)
» bit 5 (column 5) also has incorrect parity (there are five 1-bits).

First, the table shows that an error has occurred following data transmission. Second, at the intersection of row 8 and column 5, the position of the incorrect bit value (which caused the error) can be found. This means that byte 8 should have been:

[Figure: byte 8 with the value of bit 5 changed from 1 to 0]

which would also correct column 5 giving an even vertical parity (now has four 1-bits). This byte could, therefore, be corrected automatically, as shown above, or an error message could be relayed back to the sender asking them to re-transmit the block of data. One final point: if two of the bits change value following data transmission, it may be impossible to locate the error using the above method.
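Locating a single error with a parity block can be sketched in Python. The function names are hypothetical and the four-byte block below is made-up illustrative data (three bytes plus a parity byte, even parity), not the block from Table 6.2.

```python
from typing import List, Optional, Sequence, Tuple


def find_single_error(block: Sequence[str],
                      parity: str = "even") -> Optional[Tuple[int, int]]:
    """Return (row, column) of a single wrong bit, counting from 1, or None.

    Each item in block is one received byte written as a string of bits;
    the last item is the parity byte.
    """
    target = 0 if parity == "even" else 1
    # Horizontal check: rows whose count of 1-bits has the wrong parity.
    bad_rows = [row for row, byte in enumerate(block, start=1)
                if byte.count("1") % 2 != target]
    # Vertical check: columns whose count of 1-bits has the wrong parity.
    bad_cols = [col + 1 for col in range(len(block[0]))
                if sum(byte[col] == "1" for byte in block) % 2 != target]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None


def flip_bit(block: Sequence[str], row: int, col: int) -> List[str]:
    """Return a copy of the block with the bit at (row, col) inverted."""
    corrected = list(block)
    byte = corrected[row - 1]
    flipped = "1" if byte[col - 1] == "0" else "0"
    corrected[row - 1] = byte[:col - 1] + flipped + byte[col:]
    return corrected


# Illustrative block: three bytes followed by the parity byte (even parity).
sent = ["01011001", "01100000", "10010110", "10101111"]
received = ["01011001", "01101000", "10010110", "10101111"]  # byte 2, bit 5 changed

position = find_single_error(received)
print(position)
if position is not None:
    print(flip_bit(received, *position) == sent)
```

When more than one row or column fails the check, the function returns None, since the intersection no longer identifies a single bit.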
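For example, using the above example again:

[Figure: the original byte, which has even parity]

This byte could reach the destination as:

[Figure: three possible received versions of the byte, each with two bits changed and each still containing an even number of 1-bits]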
All three are clearly incorrect, but they have retained even parity so will not trigger an error message at the receiving end. Clearly, other methods (such as checksum) are required to complement parity when it comes to error checking of transmitted data.

Automatic repeat request (ARQ)
Automatic repeat request (ARQ) is another method to check data following data transmission. This method can be summarised as follows:
» ARQ uses acknowledgement (a message sent to the sender indicating that data has been received correctly) and timeout (the time interval allowed to elapse before an acknowledgement is received).
» When the receiving device detects an error following data transmission, it asks for the data packet to be re-sent.
» If no error is detected, a positive acknowledgement is sent to the sender.
» The sending device will re-send the data packet if
   – it receives a request to re-send the data, or
   – a timeout has occurred.
» The whole process is continuous until the data packet received is correct or until the ARQ time limit (timeout) is reached.
» ARQ is often used by mobile phone networks to guarantee data integrity.
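The ARQ exchange can be sketched as a small Python simulation. Everything here is an illustrative assumption rather than a real network API: the channel is scripted with a list of outcomes ('ok', 'corrupt' or 'lost'), a one-byte checksum stands in for the receiver's error check, and a limit on the number of attempts stands in for the ARQ time limit.

```python
from typing import Iterable, List, Optional, Tuple


def checksum(block: bytes) -> int:
    """One-byte checksum used by the simulated receiver to detect errors."""
    return sum(block) % 256


def flip_first_bit(block: bytes) -> bytes:
    """Simulate corruption by inverting one bit of the first byte (non-empty block)."""
    return bytes([block[0] ^ 0b00000001]) + block[1:]


def send_with_arq(packet: bytes, channel_events: Iterable[str],
                  max_attempts: int = 5) -> Tuple[Optional[int], List[str]]:
    """Send one packet; return the attempt that succeeded (or None) and a log."""
    log: List[str] = []
    events = iter(channel_events)
    sent_checksum = checksum(packet)
    for attempt in range(1, max_attempts + 1):
        event = next(events, "ok")
        if event == "lost":
            # No acknowledgement arrives, so the sender's timeout expires.
            log.append(f"attempt {attempt}: timeout, re-sending")
            continue
        received = flip_first_bit(packet) if event == "corrupt" else packet
        if checksum(received) != sent_checksum:
            # The receiver detects an error and asks for the packet again.
            log.append(f"attempt {attempt}: error detected, re-send requested")
            continue
        log.append(f"attempt {attempt}: acknowledgement received")
        return attempt, log
    return None, log   # attempt limit reached without a correct delivery


attempts, log = send_with_arq(b"DATA", ["lost", "corrupt", "ok"])
print(attempts)
for line in log:
    print(line)
```

In this run the first copy is lost, the second arrives corrupted and the third is acknowledged, so the packet is sent three times in total.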
ACTIVITY 6B
1 The following block of data was received after transmission from a remote computer; odd parity was being used by both sender and receiver. One of the bits has been changed during the transmission stage. Locate where this error is and suggest a corrected byte value:

[Table: the received block of data, with rows byte 1 to byte 9 followed by the parity byte, and columns parity bit, bit 2, bit 3, bit 4, bit 5, bit 6, bit 7 and bit 8]
2 a) A company is collecting data about new customers and is using an online form to collect the data, as shown below. Describe a suitable validation check for each of the four groups of fields.
      ① Name of person
      ② Date of birth
      ③ Telephone number
      ④ Title; Sex (Female / Male)
   b) Explain the differences between validation and verification. Why are both methods used to maintain the integrity of data?
3 A shopkeeper is populating a database containing information about goods for sale in their shop. They are entering the data manually, using both validation and verification to ensure the integrity of the entered data. Here is an example of a record:
      A21516BX – code of the item NXXXXNN (N = letter; X = digit)
      25 – number in stock (1–100)
      205.50 – unit cost in dollars
      03334445556 – telephone number of supplier of item
   a) Describe how verification could be used to ensure the accuracy of the entered data.
   b) Describe suitable validation checks for all four fields and give examples of data which would fail your chosen validation methods.
End of chapter questions
1 A college is using a local area network (LAN) to access data from a database.
   a) Give two security measures to protect the data on the college’s computer system. [2]
   b) Data regarding new students joining the college is being entered into the database. Each student has a 7-digit identification number (ID). A check digit is used as a form of checking to ensure errors have not been made when entering the ID numbers. The verification routine uses modulo-11 with the check digit as the eighth (right-most) digit. The weightings used to calculate the check digit are: 7, 6, 5, 4, 3, 2 and 1; the value 7 is the multiplier for the left-most digit.
      The ID number is: 1 5 6 3 4 1 2
      Calculate the check digit. [4]
   c) Name and describe two validation checks that could be carried out on the student ID number. [4]
2 a) Explain what antivirus software is and how it can be used to ensure data security. [4]
   b) Explain how a firewall can be used to identify illegal attempts at accessing a computer system and how they can be used to keep data safe. [4]
3 The following block of data was received after transmission from a remote computer. Odd parity was being used by both sender and receiver. One of the bits has been changed during the transmission stage. Locate where this error is and suggest a corrected byte value. [5]

[Table: the received block of data, with rows byte 1 to byte 9 followed by the parity byte, and columns parity bit, bit 2, bit 3, bit 4, bit 5, bit 6, bit 7 and bit 8]
7 Ethics and ownership

In this chapter, you will learn about
★ the need for and purpose of ethics as a computer science professional
★ the need to act ethically at all times
★ the impact of acting ethically or unethically in a given situation
★ the need for copyright legislation
★ the different types of software licensing, including free software, open source software, shareware and commercial software
★ the impact of artificial intelligence (AI) on social, economic and environmental issues.
WHAT YOU SHOULD ALREADY KNOW
Try these four questions before you read this chapter.
1 a) What is meant by an expert system?
   b) Name four components of a typical expert system.
   c) Give three examples of the use of an expert system.
2 a) What is meant by copyright?
   b) Why is copyright important?
   c) Give examples of items which would be covered by copyright laws.
   d) Differentiate between the terms plagiarism and copyright.
3 a) What impact do computers have on the general public with regards to
      i) jobs/employment
      ii) the environment
      iii) how we shop and bank
      iv) human interactions?
   b) Describe three positive aspects of the impact of computers on society.
4 What is the influence of social media on
   a) news reporting
   b) world safety
   c) personal and private lives of people
   d) politics?
7.1 Legal, moral, ethical and cultural implications

Key terms
Legal – relating to, or permissible by, law.
Morality – an understanding of the difference between right and wrong, often founded in personal beliefs.
Ethics – moral principles governing an individual’s or organisation’s behaviour, such as a code of conduct.
Culture – the attitudes, values and practices shared by a group of people/society.
Intellectual property rights – rules governing an individual’s ownership of their own creations or ideas, prohibiting the copying of, for example, software without the owner’s permission.
Privacy – the right to keep personal information and data secret and for it to not be unwillingly accessed or shared through, for example, hacking.
Plagiarism – the act of taking another person’s work and claiming it as one’s own.
BCS – British Computer Society.
IEEE – Institute of Electrical and Electronics Engineers.
ACM – Association for Computing Machinery.

The following definitions are important when considering ethical behaviour:
» Legal covers the law, whether or not an action is punishable by law.
» Morality concerns questions of right and wrong, and is more often thought of in relation to personal or individual choices.
» Ethics also concerns questions of right and wrong, but is more often used in a professional context.
» Culture refers to the attitudes, values and practices shared by a society or group of people.

Anything which breaks the law is termed illegal. Examples include copying software and then selling it without the permission of the copyright holders (see Section 7.2).

Morality is the human desire to distinguish between right and wrong. This varies from person to person, and between cultures (something that is considered immoral in one culture may be acceptable practice in another, for example). Immoral does not mean something is illegal (and vice versa). Creating a fake news website, for example, is not illegal, but it may be considered immoral if it causes distress to others. If the creator tried to obtain personal and financial data, then it would become an illegal act. Similarly, hacking is generally regarded as immoral, but not illegal. However, it becomes illegal if it compromises national security, or results in financial gain, or reveals personal information, for example. In short, there is a fine line between an immoral act and an illegal act.

Unethical behaviour is the breaking of a code of conduct. For example, if somebody works for a software company and passes on some ideas to a rival company, this would be regarded as unethical behaviour. If the software is related to national security or is formally copyrighted, then it is also illegal. It is essential to be clear whether any law has been broken.

The importance of culture is less tangible. When writing computer games, for example, programmers need to be careful that they do not include items which some cultures would find offensive or obscene. Again, this may not be unethical or illegal, but could still cause distress. It is important to realise that boundaries can easily be crossed; in some countries making fun of religion, for example, is illegal.
7.1.1 Computer ethics
Computer ethics is a set of principles set out to regulate the use of computers. Three factors are considered:
» Intellectual property rights, for example, copying of software without the permission of the owner.
» Privacy issues, for example, hacking or any illegal access to another person’s personal data.
» Effect of computers on society, for example, job losses, social impacts, and so on.
Internet use has led to an increase in plagiarism – this is when a person takes another person’s idea or work and claims it was their own. While it is fine to quote another person’s idea, it is essential that some acknowledgement is made so that the originator of the idea or work is known to others. This can be done by a series of references at the end of a document or footnotes on each page where a reference needs to be made. Software exists that can scan text and then look for examples of plagiarism by searching web pages on the internet.
7.1.2 Professional ethical bodies
There are a number of professional bodies representing individuals working in the fields of computing and information technology that have developed their own codes of conduct, to which members are expected to adhere. Belonging to one of these organisations demonstrates your professional integrity by showing that you are committed to upholding the standards they prescribe.

The British Computer Society (BCS)
The British Computer Society (BCS) is a professional body set up in the UK, initially to represent the rights and ethical practices of all professionals working in the IT and computing industries. It is now an international body which works in close partnership with other groups to monitor and advise IT practices across the globe. The BCS Code of Conduct (www.bcs.org/category/6030) covers four main areas:
1 The Public Interest
2 Professional Competence and Integrity
3 Duty to Relevant Authority
4 Duty to the Profession
The Institute of Electrical and Electronics Engineers (IEEE)
The Institute of Electrical and Electronics Engineers (IEEE) was set up in the USA with the aims of
» raising awareness of ethical issues
» promoting ethical behaviour among professionals working in the electronics industry
» ensuring engineers and scientists respect the need for ethical behaviour.

To help in this aim, the IEEE has also set out a code of ethics:

IEEE Code of Ethics
We, the members of the IEEE, in recognition of the importance of our technologies in affecting the quality of life throughout the world, and in accepting a personal obligation to our profession, its members, and the communities we serve, do hereby commit ourselves to the highest ethical and professional conduct and agree:
1 to hold paramount the safety, health, and welfare of the public, to strive to comply with ethical design and sustainable development practices, and to disclose promptly factors that might endanger the public or the environment;
2 to avoid real or perceived conflicts of interest whenever possible, and to disclose them to affected parties when they do exist;
3 to be honest and realistic in stating claims or estimates based on available data;
4 to reject bribery in all its forms;
5 to improve the understanding by individuals and society of the capabilities and societal implications of conventional and emerging technologies, including intelligent systems;
6 to maintain and improve our technical competence and to undertake technological tasks for others only if qualified by training or experience, or after full disclosure of pertinent limitations;
7 to seek, accept, and offer honest criticism of technical work, to acknowledge and correct errors, and to credit properly the contributions of others;
8 to treat fairly all persons and to not engage in acts of discrimination based on race, religion, gender, disability, age, national origin, sexual orientation, gender identity, or gender expression;
9 to avoid injuring others, their property, reputation, or employment by false or malicious action;
10 to assist colleagues and co-workers in their professional development and to support them in following this code of ethics.

Jointly with the Association for Computing Machinery (ACM), the IEEE has also developed a set of eight principles which govern the code of ethics specifically among software engineers. The principles set out to ensure all engineers meet an acceptable and consistent code of ethics. The general public, as well as their peers, have certain expectations of scientists and engineers. The eight principles behind the code of ethics and professional practice were published in 1999. An abridged version is shown below; a full version can be found at: www.computer.org/web/education/code-of-ethics

Software Engineering Code of Ethics
1 PUBLIC – Software engineers shall act consistently with the public interest (contains 8 sub-clauses).
2 CLIENT AND EMPLOYER – Software engineers shall act in a manner that is in the best interests of their client and employer consistent with the public interest (contains 9 sub-clauses).
3 PRODUCT – Software engineers shall ensure that their products and related modifications meet the highest professional standards possible (contains 15 sub-clauses).
4 JUDGEMENT – Software engineers shall maintain integrity and independence in their professional judgement (contains 6 sub-clauses).
5 MANAGEMENT – Software engineering managers and leaders shall subscribe to and promote an ethical approach to the management of software development and maintenance (contains 12 sub-clauses).
6 PROFESSION – Software engineers shall advance the integrity and reputation of the profession consistent with the public interest (contains 13 sub-clauses).
7 COLLEAGUES – Software engineers shall be fair to and supportive of their colleagues (contains 8 sub-clauses).
8 SELF – Software engineers shall participate in life-long learning regarding the practice of their profession and shall promote an ethical approach to the practice of the profession (contains 9 sub-clauses).
There are 80 clauses and sub-clauses in total. We shall consider one scenario and see how it fits into a selection of the clauses. Mikhail works during the day for a software company called EthicalGamz developing new software in a number of applications. Mikhail is part of a large team of software engineers writing and testing new code. The team also do market research to help in their development of new software for the future. Much of the work is commercially sensitive and multiple layers of access exist to protect the company from unauthorised sharing of data. In the evenings and at the weekend, Mikhail works for his own company, MikhailSoft, which produces software available to buy on the internet only. To save costs, Mikhail uses coding he helped develop for EthicalGamz in his own software. He also outsources some of the work to software engineers in other countries where the wages are much lower and ethics policies are more lax. This saves him a lot of time and money when producing his own software. Mikhail does not pay any licensing fees to EthicalGamz and makes no reference to any code used from that company in his own products.
We will now consider the ethical implications of the above scenario using the following sub-clauses from the Software Engineering Code of Ethics.

1.03 approve software only if they have a well-founded belief that it is safe, meets the specification and passes the appropriate tests and does not diminish the quality of life, diminish privacy or harm the environment;
There is an ethical issue here since the software written by personnel from other countries may not meet the specification requirements or appropriate tests. It could lead to any of the three factors being violated, for example, the software may contain spyware of which Mikhail is unaware.

2.02 not knowingly use software that is obtained or retained either illegally or unethically;
Mikhail has no control over the coding being developed by his overseas team. Furthermore, using the coding from EthicalGamz is illegal use.

3.05 ensure an appropriate method is used for any project on which they work or propose to work;
External companies (in his own country or overseas) may be used at various steps in the production of Mikhail’s own software. Unless he applies good managerial control, he will be unable to ensure methods used in projects are appropriate or fully ethical in their implementation.

4.02 only endorse documents either prepared under their supervision or within their areas of competence and with which they are in agreement;
Documentation produced by third party developers is not produced under Mikhail’s direct supervision. Indeed, some of the work done overseas may be outside Mikhail’s sphere of knowledge, which probably removes his ability to objectively endorse the external work being done.

5.03 ensure that software engineers know the employer’s policies and procedures for protecting passwords, files and information that is confidential to the employer or to others;
By using software developed by EthicalGamz for his own use, Mikhail may need to give passwords and access to other files to engineers working for his own company, MikhailSoft. This would allow non-authorised personnel access to files and information stored on EthicalGamz computer systems, leading to a potential security breach.

6.05 not protect their own interest at the expense of the profession, client or employer;
By using coding from EthicalGamz, Mikhail is enhancing his own interests at the expense of the company and his colleagues at that company.

7.03 credit fully the work of others and refrain from taking undue credit;
By using coding from EthicalGamz illegally and unethically, and by making no reference to the source of his ‘illegal’ code, Mikhail is effectively taking full credit for all the work done by his colleagues.

8.07 do not give unfair treatment to anyone because of any irrelevant prejudices
Mikhail may dismiss overseas workers who do not agree with his own political or religious beliefs and such dismissals would be deemed unfair and break this code of practice.

EXTENSION ACTIVITY 7A
Using the example above, consider the following eight sub-clauses and decide how (or if) Mikhail is breaking the code of ethics in each case.
1.01 accept full responsibility for their own work
2.03 use the property of a client or employer only in ways properly authorised, and with the client’s or employer’s knowledge and consent
3.03 identify, define and address ethical, economic, cultural, legal and environmental issues related to work projects
4.04 not engage in deceptive financial practices such as bribery, double billing or other improper financial practices
5.02 ensure that software engineers are informed of standards before being held to them
6.08 take responsibility for detecting, correcting, and reporting errors in software and associated documents on which they work
7.02 assist colleagues in their professional development
8.09 recognise that personal violations of this Code are inconsistent with being a professional software engineer
7.1.3 Impact on the public
Figure 7.1 summarises the potential impact of any software or hardware being developed on the general public.

[Diagram: PUBLIC WELL BEING, with the labels: health and safety concerns; in the public interest; benefits to the public; concerns of the public]
▲ Figure 7.1 Potential impact of software or hardware being developed on the general public
While software engineers and scientists consider the Software Engineering Code of Ethics, the impact on the general public cannot be ignored. This section begins by considering three instances in which computer hardware or software led to expensive errors, which impacted on the general public.

LA airport shutdown in 2007
In this example, aeroplanes at LA airport (in the USA) were grounded due to a simple hardware issue: a faulty network card in a device continued to send incorrect data over the airport’s network. Eventually, the whole of the USA Customs and Borders Agency came to an abrupt standstill at LA airport. This resulted in all flights leaving and landing at the airport being cancelled for about eight hours until the fault was cleared. It cost the aeroplane operators several million US dollars in lost revenue. The impact on the general public was cancellation of holidays, loss of business and general frustration.

Exploding laptop computers in 2008
Japan holds an annual trade show displaying the latest in computer technology. In 2008, during the trade show, a number of Dell laptop computers burst into flames in full view of the visiting public and television cameras. The problem was traced back to faulty batteries in the laptops, which had been overheating and eventually exploded and burst into flames. As if this was not enough, the problem escalated when Apple reported similar problems with some of its tablets, laptops and desktop computers. Some 100 million computer devices had to be recalled at an estimated cost of over 300 million US dollars to the manufacturers. The impact on the general public would have been devastating if this problem had not been discovered before the devices were generally available to buy.

Airbus A380 incompatible software issue in 2006
Airbus Industries uses a number of factories throughout Europe where the design, development and construction of aeroplanes takes place. During 2006, while the new A380 was being developed, a surprising issue came to light: the software in two factories would not ‘talk to each other’. The factory in Hamburg (Germany) was using an old version of CATIA design software while another plant in Toulouse (France) was using the latest version of CATIA software. When a part of the A380 from Hamburg and a part of the A380 from Toulouse were brought together for assembly, the wiring in the two parts did not match up (the cables could not be linked together). This was because the two versions of the software produced different design specifications for the wiring. It cost the company millions of euros to redesign and remanufacture the parts where old software was still in use. Fortunately, this was not a safety issue, but if some other design incompatibility had occurred after assembly of an A380, the effect could have been catastrophic, leading to possible loss of life.
All of these examples are cost-related, but still had – or potentially had – an impact on the general public. Regrettably, there are many other examples. Other issues which can affect the general public and businesses include
» companies selling software systems which do not meet the required standard for security (inadequate protection against hacking, spyware and other security issues)
» the covering up of security issues (such as the XEN security threat which forced several cloud servers to become compromised – an attempt was made to cover up the issue but the affected cloud operators had to come clean)
» the release of private data (such as the celebrity photo leaks, when a cloud server was hacked)
» social media not policing subversive activity, such as hate mail and cyber bullying. Such activity is undergoing close scrutiny by several countries around the world
» search engines giving results at the top of the search due to donations to the search engine operators.

EXTENSION ACTIVITY 7B
Bearing in mind some of the issues raised above, consider these two questions.
1 Should we police the internet to stop certain activities taking place?
2 Should governments have the power to close down websites (such as Twitter or Facebook) which do not remove hate mail, incitements to violence or unacceptable photographs from their sites?
ACTIVITY 7A
1 Describe why it is necessary to produce a code of ethics to cover the computing and electronics industries.
2 Mariam and Asma were having a discussion about whether or not the internet should be policed.
Mariam was in favour of the argument and put forward two reasons.
① It would prevent illegal material being posted on websites, such as racist comments, pornography, terrorist activities and so on.
② Some form of control would prevent children and other vulnerable groups being subjected to undesirable websites.
Asma was against the argument and put forward two of her own reasons.
① Material published on websites is already available from other sources.
② Policing would go against freedom of information and freedom of speech.
Put forward your own arguments and discuss whether you think Mariam’s or Asma’s reasons are valid.
3 Describe the main differences between the terms: legal, morality, ethics and culture. Give examples of each.
7.2 Copyright issues

Key terms
Piracy – the practice of using or making illegal copies of, for example, software.
Product key – security method used in software to protect against illegal copies or use.
Digital rights management (DRM) – used to control the access to copyrighted material.
Free Software Foundation – organisation promoting the free distribution of software, giving users the freedom to run, copy, change or adapt the coding as needed.
Open Source Initiative – organisation offering the same freedoms as the Free Software Foundation, but with more of a focus on the practical consequences of the four shared rules, such as more collaborative software development.
Freeware – software that can be downloaded free of charge; however, it is covered by the usual copyright laws and cannot be modified; nor can the code be used for another purpose.
Shareware – software that is free of charge initially (free trial period). The full version of the software can only be downloaded once the full fee for the software has been paid.
7.2.1 Software copyright and privacy
Software is protected by copyright laws in much the same way as music CDs, videos and articles from magazines and books are protected. When software is purchased, there are certain rules that must be obeyed:
» It is illegal to make a software copy and sell it or give it away.
» Software cannot be used on a network or used on multiple computers without a multi-use licence.
» It is illegal to use coding from copyrighted software in your own software – and then pass this software on or sell it as your own – without the permission of the copyright holder.
» It is illegal to rent out a software package without permission to do so.
» It is illegal to use the name of copyrighted software on other software without agreement to do so.

Software piracy (making illegal copies of software) is a major issue among software companies. They take many steps to stop the illegal copying of software and to stop illegal copies being used once they have been sold:
» When software is being installed, the user will be asked to key in a unique reference number or product key (a string of letters and numbers) which was supplied with the original copy of the software (for example: 4a3c 0efa 65ab a81e).
» The user will be asked to click a button or box which states they agree to the licence agreement before the software continues to install.
» The original software packaging often comes with a sticker informing the purchaser that it is illegal to make copies of the software; the label is often in the form of a hologram indicating that this is a genuine copy.
» Some software will only run if the CD-ROM, DVD-ROM or memory stick is actually in the drive; this stops illegal multiple use and network use of the software.
» Some software will only run if a dongle is plugged into one of the USB ports.
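The product key check described in the list above could be sketched in Python as follows. This is only an illustration under stated assumptions: the layout (four groups of four hexadecimal digits) is inferred from the single example key in the text, and the names KEY_PATTERN, ISSUED_KEYS, normalise, is_well_formed and is_valid are hypothetical. Real installers normally use cryptographic checks or online activation rather than a stored list of keys.

```python
import re

# Assumed layout, inferred from the example key in the text:
# four groups of four hexadecimal digits separated by single spaces.
KEY_PATTERN = re.compile(r"[0-9a-f]{4}( [0-9a-f]{4}){3}")

# Hypothetical set of keys issued by the software company (illustration only).
ISSUED_KEYS = {"4a3c 0efa 65ab a81e"}


def normalise(key: str) -> str:
    """Lower-case the key and collapse runs of whitespace to single spaces."""
    return " ".join(key.lower().split())


def is_well_formed(key: str) -> bool:
    """Return True if the key matches the assumed four-by-four hex layout."""
    return KEY_PATTERN.fullmatch(normalise(key)) is not None


def is_valid(key: str) -> bool:
    """Return True if the key is well formed and is one of the issued keys."""
    return is_well_formed(key) and normalise(key) in ISSUED_KEYS
```

An installer built along these lines would refuse to continue unless is_valid returned True for the key typed in by the user. A key can be well formed without being valid, which is why the two checks are kept separate.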
(See also Section 7.2.2 regarding further copyright protection using DRM.)

The Federation Against Software Theft (FAST) was set up in the UK to protect the software industry against piracy. FAST prosecutes organisations and individuals involved in any copyright infringements. Similar organisations exist in other countries. The following extract from a newspaper article describes a typical example of how strict the anti-piracy laws are in some countries.

TRADERS FINED $100 000
Two eBay traders from the United States of America agreed this week to pay a total of $100 000 in damages after they were caught selling illegal copies of Norton security software.
The SIIA settled the case against the two traders who also agreed to stop selling illegal software and provided SIIA with records identifying their customers and suppliers.

7.2.2 The internet and the World Wide Web (WWW)
Digital rights management (DRM) was originally set up to control what devices a CD could play on. Preventing a CD from playing on a computer, for example, would help stop it being copied illegally. DRM has since been updated to cover more areas; it does this by using protection software to help stop the copying of, for example, music tracks, video files or ebooks. DRM creates restrictions that control what the users can do with the data. Examples include allowing a music file to be streamed over the internet but not copied, allowing an ebook to be read on a tablet only, or requiring a game to have an internet connection to a certain website in order to work. The aim of DRM is to ensure that any attempt made to break the copyright protection will produce a defective copy which will not work.

When you buy a product protected by DRM, it may come with a key which licenses a single user on one device and this key must be registered. Another example – of which there are many – is Apple Music’s use of DRM layers in streamed music to prevent a user downloading all the music in the first month of a subscription and then cancelling their subscription.
7.2.3 Software licensing

Commercial software
Commercial software is available to customers for a fee, providing a licence for one genuine copy to be used on a single device, or a multi-use licence for multiple users. Occasionally, software is offered free of charge if an earlier version was bought by the user. This type of software is fully copyright-protected and none of the code can be used without the prior consent of the copyright owner.

Free software and the Open Source Initiative
The Free Software Foundation and the Open Source Initiative are non-profit organisations that promote the benefits of giving users the freedom to run, copy, change and adapt software. Examples of software licensed in this way include: F-Spot (photographic manager), Scribus (DTP/word processor) and LibreOffice (Office Suite). Users are allowed to follow the four freedoms:
» Run the software for any legal purpose they wish.
» Study the program source code and modify it where necessary to meet their needs.
» Redistribute copies of the software to friends and family.
» Distribute code modified by the user to friends and family.

Users do not need to seek permission to do the above since the software is not protected by copyright restrictions. However, there are still some rules that the user must adhere to. Users cannot
» add source code from another piece of software unless this is also described as free software or open source software
» use the source code to produce software which copies existing software which is subject to copyright laws
» adapt the source code in such a way that it infringes copyright laws protecting other software
» use the source code to produce software which is deemed offensive by third parties.
While the two organisations promote the same four freedoms, they have different basic philosophies. Free Software Foundation focuses on what the recipient of the software is permitted to do with the software. Open Source Initiative focuses on the practical consequences offered by the four freedoms; the aims are to provide effective collaboration on software development by the users. There are ten principles that have been developed to ensure the philosophy of the Open Source Initiative is adhered to:

1 Free Redistribution
The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.

2 Source Code
The program must include source code and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code there must be a well-publicised means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.

3 Derived Works
The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

4 Integrity of The Author’s Source Code
The license may restrict source code from being distributed in modified form only if the license allows the distribution of ‘patch files’ with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.

5 No Discrimination Against Persons or Groups
The license must not discriminate against any person or group of persons.

6 No Discrimination Against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

7 Distribution of License
The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.

8 License Must Not Be Specific to a Product
The rights attached to the program must not depend on the program’s being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program’s license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.

9 License Must Not Restrict Other Software
The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.

10 License Must Be Technology-Neutral
No provision of the license may be predicated on any individual technology or style of interface.
Freeware
Freeware is software a user can download from the internet free of charge. Once it has been downloaded, there are no fees associated with using the software (examples include: Adobe Reader, Skype and some media players). Unlike free software, freeware is subject to copyright laws and users are often requested to tick a box to say they understand and agree to the terms and conditions governing the software. This means that a user is not allowed to study or modify the source code in any way.

Shareware
Shareware allows users to try out some software free of charge for a trial period. At the end of the trial period, the author of the software will request that you pay a fee if you wish to continue using it. Once the fee is paid, a user is registered with the originator of the software and free updates and help are then provided. Often, the trial version of the software is missing some of the features found in the full version, and these do not become available until the fee is paid.

This type of software is protected by copyright laws and users must not use the source code in any of their own software without permission.
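The way a shareware trial works can be sketched in Python as follows. This is a minimal illustration, not how any particular product does it: the 30-day trial length, the feature names and the function names trial_expired, can_run and available_features are all assumptions made for the example.

```python
from datetime import date, timedelta

# Assumed values for illustration; the text does not specify them.
TRIAL_LENGTH = timedelta(days=30)
TRIAL_FEATURES = {"open", "edit"}
FULL_FEATURES = TRIAL_FEATURES | {"save", "export"}


def trial_expired(installed_on: date, today: date) -> bool:
    """Return True once more than TRIAL_LENGTH has passed since installation."""
    return today - installed_on > TRIAL_LENGTH


def can_run(installed_on: date, today: date, fee_paid: bool) -> bool:
    """A paid copy always runs; an unpaid copy runs only during the trial."""
    return fee_paid or not trial_expired(installed_on, today)


def available_features(fee_paid: bool) -> set:
    """Return the full feature set for paid copies, the reduced set otherwise."""
    return FULL_FEATURES if fee_paid else TRIAL_FEATURES
```

For example, an unpaid copy installed on 1 January would still run on 15 January but not on 1 March, while a copy for which the fee has been paid would run on both dates and unlock the extra features.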
ACTIVITY 7B
1 a) What is meant by the term software piracy?
b) Describe three ways of protecting software against deliberate attempts at making copies to sell or give away.
2 A software company offers a suite of shareware programs. It contains a spreadsheet, word processor, database and drawing package. What are the benefits to the following two stakeholders of offering software packages as shareware?
» The company
» The customer
7.3 Artificial intelligence (AI)

Key term
Artificial intelligence (AI) – machine or application which carries out a task that requires some degree of intelligence when carried out by a human counterpart.

7.3.1 What is AI?
Artificial intelligence (AI) is a machine or application which carries out a task that requires some degree of intelligence when carried out by a human being. These tasks could include
» the use of a language
» carrying out a mathematical calculation or function
» recognising a person’s face
» the ability to operate machinery, such as a car, an aeroplane or a train
» analysing data to predict the outcome of a future event, such as weather forecasting.

▲ Figure 7.2 Examples of how AI can be used in everyday life

AI duplicates human tasks requiring decision-making and problem-solving skills.
7.3.2 The impact of AI
People often associate AI with science fiction, fantasy and robots. Numerous films and books fuel this association. The science fiction author, Isaac Asimov, went so far as to produce his own three laws of robotics:
1 A robot may not injure a human through action or inaction.
2 A robot must obey orders given by humans, unless this conflicts with the first law.
3 A robot must protect itself unless it conflicts with the two laws above.

However, AI goes way beyond robotics. It covers an ever-increasing number of areas, such as
» autonomous (driverless) vehicles
» artificial limb technology
» drones, used to carry out dangerous or unpleasant tasks such as bomb disposal, welding, or entering nuclear disaster areas
» climate change predictions
» medical procedures, such as eye operations where extreme precision is required.
7.3.3 The impacts of AI on society, the economy and the environment
As a result of increasing automation over the next few decades, the human race will need to consider the impacts that AI will have on society, the economy and the environment. So should we all be worried? In this section, we will consider a number of existing AI technologies, plus some predictions for the future, to help stimulate discussions. As mentioned in Section 7.3.2, AI is not just about robots, but covers many areas (Chapter 18 explores specific AI technologies in more depth). We will look at some of the areas mentioned in Section 7.3.2 and consider the implications of using AI (the descriptions that follow mix benefits and drawbacks; in Activity 7C you will need to consider the overall impact).
New developments in AI are constantly being announced and you are advised to keep up to date by checking out the many websites that keep an eye on AI development. Below are some of the developments and impacts that are currently expected to be seen in the near future.

Research has predicted that, by 2030, some 600 million jobs will be lost globally and as many as 400 million people will need to retrain or switch jobs – all caused by the inevitable advances in AI. The most likely jobs to be lost are those doing medium- and low-skilled work, but high-skilled jobs (such as hospital technicians, architects, engineers) are also at risk. This could lead to civil unrest with large numbers of young people out of work, with few or no employment prospects, unless they have a sought-after skill.

History has shown, however, that previous technological advances all ended up creating a net increase in jobs. As automation takes over, jobs on the factory floor are lost, but production becomes much faster and more efficient, thus requiring an increase in the number of people doing tasks that the automation process cannot yet do, such as quality control, test driving new vehicles and so on. Technology creates new jobs which are generally more interesting to humans than the manual jobs which are lost. However, history does not always repeat itself, so we need to prepare ourselves for a large reduction in employment and think about how to redistribute wealth so that the overall impact of AI will be positive.

It is predicted that, eventually, 99% of all jobs could be eliminated since the increase in the use of AI is exponential – competition between countries and companies to expand their economies will continue to fuel this growth. One question that might be legitimately asked is, ‘if 99% of jobs disappear, who will build the robots and maintain them?’ To answer that question, let us consider a present-day example. 3D printers are actually now being designed and made by other 3D printers with no human interaction – the whole process is automatic with AI algorithms in control of the building, design and maintenance of these printers. So, it seems logical that other robots/machines will build and maintain future robots and other AI systems. An increase in AI will leave people with more time to pursue their hobbies and have a better lifestyle.

Previous industrial revolutions have led to steep changes in the economies of countries that embrace the new technology. Being left behind is not an economic option, but is it a good environmental option? Improvements in AI technology can have a positive impact on the environment. Scientists now have more information than ever about what affects the environment. AI can help by finding patterns and interconnections within the thousands of data sets. This helps scientists make informed predictions about the environment and potential climate change. Since this analysis is very complex, the use of AI systems can speed up this process considerably and allow the human race to take action much faster than it could by present methods. Here are some potential ways in which AI can help:
» AI can help us to conserve natural resources (for example, improve the conservation of water supplies).
» Detection of pollution in the air and in the seas using AI is much more accurate, allowing scientists to pinpoint the source(s) of pollution more accurately and much faster.
» In the future it could be possible to combine weather forecasting and AI to allow for better predictions about renewable energy resources needed for the next few days. This would lead to a more precise automated renewable energy forecast using solar, tide, thermal and wind energy generation.
» AI would allow us to learn from nature’s ecosystems by monitoring and modelling, for example, a river’s ecosystem. This would enable us to gain a better understanding of what can affect the delicate balance of life in the river. Such real-time environmental monitoring would allow us to quickly take remedial action before the effects became irreversible. AI would make this possible due to the ability to analyse vast amounts of very complex (inter-related) data.
We will now look at three particular areas where AI could have a large impact.
Transport
Some taxi companies are already looking at the introduction of autonomous (driverless) cars. A customer can call up the taxi using an app on their mobile phone, which also automatically handles the payment. Information about the taxi (such as its location and estimated arrival time) would be sent to the mobile phone until the driverless taxi arrives at the exact pick-up point. There would not be any people anywhere in the chain, with AI systems taking total control. Some car manufacturers are on the brink of actually supplying autonomous vehicles (cars, buses and trucks). This would be much more efficient but would put many drivers out of a job.

Criminal justice system
Advances in facial recognition systems are making fingerprinting in forensic science almost obsolete. AI is also being used to automate legal work and some courts in the USA have trialled the use of AI to sentence criminals and even decide if a prisoner is eligible for parole. Is this a bad thing? Here are some questions to think about:
» Does government use of AI need a warrant to allow online data to be searched for all potential criminal activity?
» Can AI be used to listen in to our mobile phone conversations and assess our emails? Social media companies are already coming under pressure in this area – would AI help this or could criminals make use of it to hide criminal activity?
» What about legal malpractice – what would be the mechanism to challenge an AI-inspired legal decision?
» How do we ensure no bias creeps into AI decision-making processes? The software being trialled in the USA to determine a prisoner’s suitability for parole is already showing bias against African Americans. How do we ensure that such prejudice, whether from governments or individuals using AI systems, is not allowed to occur?
Advertising and use of data
Algorithms can now tailor advertising aimed at specific people by using AI machine learning – this is done by building personality profiles of every internet and mobile phone user. Data is picked up from search engines, social media and visits to websites – all this data can be analysed by machine learning algorithms (see Chapter 18 for more details).

You may remember the Cambridge Analytica scandal in 2018 which hinged on potential misuse of data obtained from a social media company (nearly 90 million profiles had been used by the company, leading some people to believe it had influenced the 2016 USA presidential elections). AI could reduce such occurrences by allowing much closer monitoring. It would need to be very sophisticated and act quickly to have any real impact – human beings certainly could not respond fast enough.

ACTIVITY 7C
Look through this chapter on the impacts of AI and produce a short essay or wall display highlighting the pros and cons. Draw a reasoned conclusion and debate the overall impact of AI with your classmates.
ACTIVITY 7D
1 In 2017, Diane Bryant, the chief operating officer of Google Cloud, claimed that AI can:
» help us manage the Earth’s very scarce resources
» improve cancer diagnosis using precision medicine leading to customised treatments
» lead to improvements in human rights in many countries due to cloud computing, better connectivity and reduced costs in developing faster computers.
Describe, with examples, why Ms Bryant’s claims could help people in the future.
2 Give three different examples of AI. For each of your examples, give one benefit and one drawback to the general public.
End of chapter questions
1 Nicolae has joined a software company as a new team manager. During his induction he was given a presentation on the company’s code of conduct and the company’s expected ethical behaviour. He was given hand-outs after the presentation which included the code of conduct and ethical behaviour. a) Explain what is meant by the term ethics.[2] b) Describe the differences between behaving in an unethical manner and in an illegal manner. [3] c) Nicolae joins a team writing new software in a programming language unfamiliar to him. Part of his job will be to visit a client and oversee the team writing the software to meet the client’s requirements. ➔ 193
457591_07_CI_AS & A_Level_CS_178-195.indd 193
25/04/19 9:52 AM
7 Ethics and ownership
7
He has little previous experience of working off-site at the client’s premises, and has to depend on a junior colleague to help him through the process. This makes Nicolae uncomfortable in his role as project manager. After six months with the company, Nicolae will have a meeting with his own line manager. The line manager will check Nicolae’s progress against the IEEE eight principles and code of practice. Nicolae has decided to raise three issues with his line manager.
i) Describe three issues he could legitimately raise. [3]
ii) State which of the IEEE’s eight principles each issue described in part c) i) comes under. [3]
iii) Describe what actions the line manager should take to address the three issues you raised in part c) i). [3]
2 a) Name three types of software licensing. [3]
b) For each example, describe three features which identify the differences between them. [3]
c) Describe how copyright issues affect each type of named software licensing. [3]
3 a) Computers over the years have been described as first to fifth generation. Identify the generation that is associated with AI. [1]
A first
B second
C third
D fourth
E fifth
b) AI is involved in problem-solving. Identify the term that is used to describe the ‘common sense’ part of problem-solving. [1]
A analysis
B critical design
C heuristics
D programming
E sampling
c) Identify the statement that best describes AI. [1]
A inputting knowledge into a computer
B programming a computer using an expert’s experiences
C playing a strategic game, such as chess
D making a machine behave in an intelligent way
E using a computer to mimic human behaviour
d) Identify the AI process that involves repetition, evaluation and then refinement. [1]
A diagnostics
B fact finding
C heuristics
D interpretation
E iteration
4 The IEEE Software Engineering Code of Ethics uses eight key principles shown in the right-hand column of the following diagram. Tom is employed as a tester with a software company. He is keen to become a trainee programmer. The middle column in the diagram labels six incidents which have happened to Tom this week. The table that follows the diagram describes each incident.
[Diagram: three columns. Left-hand column: ETHICAL BEHAVIOUR, UNETHICAL BEHAVIOUR. Middle column: Incident A, Incident B, Incident C, Incident D, Incident E, Incident F. Right-hand column: PUBLIC, CLIENT & EMPLOYER, PRODUCT, JUDGEMENT, MANAGEMENT, PROFESSION, COLLEAGUES, SELF.]
Incident | Description
A | Tom has received some phishing emails. He reported this to the bank they were supposed to come from.
B | Tom has asked his manager if they will pay for him to attend a programming course.
C | Tom is testing beta versions of new games software at work. He copies the software on to CD-Rs and sells them to his friends.
D | Tom has completed the application forms to join the Chartered Institute for IT.
E | Tom finds it difficult to work with one of his colleagues. His way of dealing with this has been to refuse to speak with the colleague.
F | Tom’s manager had considered the testing of a new game was completed. Tom reported to his manager that he thought there were still bugs which needed to be rectified.
a) Copy the diagram above and connect each of the six incidents to either ethical behaviour or unethical behaviour. [2]
b) Consider each incident you have identified as ethical behaviour. Indicate the IEEE category each incident maps to. [4]
Adapted from Cambridge International AS & A Level Computer Science 9608 Paper 12 Q5 November 2017
8 Databases
In this chapter, you will learn about
★ the limitations of a file-based approach to storage and retrieval of data
★ the features of a relational database that overcome the limitations of a file-based approach
★ the terminology associated with a relational database model
★ entity-relationship (E-R) diagrams to document database design
★ normalisation to third normal form (3NF)
★ producing a normalised database design for a given set of data or tables
★ the features provided by a database management system (DBMS)
★ the software tools provided by a DBMS
★ the creation and modification of a database structure using a database definition language (DDL)
★ queries and the maintenance of a database using a database manipulation language (DML)
★ using SQL as a DDL and as a DML
★ how to understand a given SQL script
★ how to write an SQL script.
8.1 Database concepts
WHAT YOU SHOULD ALREADY KNOW
Try these three questions before you read the first part of this chapter.
1 Databases are commonly used to store large amounts of data in a well organised way. Identify three databases that are storing information about you.
2 a) Name three of the most commonly used database management systems.
b) Give four benefits of using a database.
3 Relational databases use their own terminology.
a) Explain what the terms table, field and record mean.
b) Identify and explain the meaning of three or more terms used with relational databases.
c) What is a normalised relational database?
Key terms
Database – a structured collection of items of data that can be accessed by different applications programs.
Relational database – a database where the data items are linked by internal pointers.
Table – a group of similar data, in a database, with rows for each instance of an entity and columns for each attribute.
Record (database) – a row in a table in a database.
Field – a column in a table in a database.
Tuple – one instance of an entity, which is represented by a row in a table.
Entity – anything that can have data stored about it, for example, a person, place, event, thing.
Attribute (database) – an individual data item stored for an entity, for example, for a person, attributes could include name, address, date of birth.
Candidate key – an attribute or smallest set of attributes in a table where no tuple has the same value.
Primary key – a unique identifier for a table. It is a special case of a candidate key.
Secondary key – a candidate key that is an alternative to the primary key.
Foreign key – a set of attributes in one table that refer to the primary key in another table.
Relationship – situation in which one table in a database has a foreign key that refers to a primary key in another table in the database.
Referential integrity – property of a database that does not contain any values of a foreign key that are not matched to the corresponding primary key.
Index (database) – a data structure built from one or more columns in a database table to speed up searching for data.
Entity-relationship (E-R) model or E-R diagram – a graphical representation of a database and the relationships between the entities.
Normalisation (database) – the process of organising data to be stored in a database into two or more tables and relationships between the tables, so that data redundancy is minimised.
First normal form (1NF) – the status of a relational database in which entities do not contain repeated groups of attributes.
Second normal form (2NF) – the status of a relational database in which entities are in 1NF and any non-key attributes depend upon the primary key.
Third normal form (3NF) – the status of a relational database in which entities are in 2NF and all non-key attributes are independent.
Composite key – a set of attributes that form a primary key to provide a unique identifier for a table.
8.1.1 The limitations of a file-based approach
A file is a collection of items of data. It can be structured as a collection of records, where each record is made up of fields containing data about the same ‘thing’. Individual elements of data can be called data items.
When a program is used for data processing, the organisation of any records used depends on how the program is written. Records can be fixed or variable in length and each record may also contain information about its structure, for example, the number of fields or the length of the record. If these records are to be processed by another program, that program must be written to use the exact same record structure. If the structure is changed by one program, the other program must be rewritten as well. This can cause problems if updating programs is not carefully managed.
For example, a business keeps separate payroll files and sales files. Each file is used by a different application.
[Figure: two separate files, each used by its own program. Payroll file record structure: First Name, Second Name, Address, Phone Number, Staff Number; used by the Payroll Program (record description, validation rules, processing code). Sales file record structure: Name, Staff Number, Target Sales, Actual Sales; used by the Sales Program (record description, validation rules, processing code).]
▲ Figure 8.1 File-based approach
Several problems have occurred using this file-based approach. The name of a member of staff and their staff number are stored twice. The way the staff name is stored is different for each program. If the staff number was changed by the payroll program and not by the sales program, these fields may contain different values for the same member of staff. The fields in the two files are also in a different order: the staff number is the fifth field in the payroll record and the second field in the sales file.
A file-based approach is limited because
» storage space is wasted when data items are duplicated by the separate applications and some data is redundant
» data can be altered by one application and not by another; it then becomes inconsistent
» enquiries available can depend on the structure of the data and the software used so the data is not independent.
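The inconsistency problem can be sketched in a few lines of Python. This is only an illustration: the record layouts follow Figure 8.1, but the staff details and the function name payroll_change_staff_number are invented for the example.

```python
# Two separate files, each owned by a different program (layouts follow Figure 8.1).
# The staff details below are invented for illustration.
payroll_file = [
    {"FirstName": "Noor", "SecondName": "Baig", "Address": "School House 1",
     "PhoneNumber": "555-0100", "StaffNumber": 101},
]
sales_file = [
    {"Name": "Noor Baig", "StaffNumber": 101, "TargetSales": 500, "ActualSales": 450},
]


def payroll_change_staff_number(old, new):
    """The payroll program only knows about its own file."""
    for record in payroll_file:
        if record["StaffNumber"] == old:
            record["StaffNumber"] = new


payroll_change_staff_number(101, 202)

# The same member of staff now has two different staff numbers.
payroll_numbers = {record["StaffNumber"] for record in payroll_file}
sales_numbers = {record["StaffNumber"] for record in sales_file}
inconsistent = payroll_numbers != sales_numbers
print(inconsistent)  # True
```

Nothing in either program detects the mismatch, because neither program can see the other's file.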
ACTIVITY 8A
Match the problems with the payroll and sales system to the limitations of a file-based approach set out above.
8.1.2 The advantages of a relational database over a file-based approach
What is a database?
There are many different definitions of a database, such as:
… A (large) collection of data items and links between them, structured in a way that allows it to be accessed by a number of different applications programs. The term is also used loosely to describe any collection of data.
BCS Glossary of Computing, 14th Edition
… An electronic filing cabinet which allows the user to perform various tasks including: adding new empty files, inserting data into existing files, retrieving data from existing files, updating data in existing files and cross-referencing data in files.
An Introduction to Database Systems (sixth edition) by CJ Date
More straightforwardly, a database is a structured collection of items of data that can be accessed by different applications programs. Data stored in databases is structured as a collection of records, where each record is made up of fields containing data about the same ‘thing’. A relational database is a database in which the data items are linked by internal pointers.
Using the same example as previously, a business keeps a database for payroll and sales data. A payroll application is used for the payroll and a sales processing application is used for sales.
[Figure: a single database holding Tables Design, Validation Rules, Data, and Users and Access Rights, shared by the Payroll Application and the Sales Application.]
▲ Figure 8.2 Database approach
The problems that occurred using the file-based approach have been solved. The name of a member of staff and their staff number are only stored once. So, any changes made to the data by the payroll application will be seen by the sales processing application and vice versa. The fields are the same and in the same order.
A database approach is beneficial because
» storage space is not wasted as data items are only stored once, meaning little or no redundant data
» data altered in one application is available in another application, so the data is consistent
» enquiries available are not dependent on the structure of the data and the software used, so the data is independent.
8.1.3 Relational database model terminology
In order to rigorously define the structure of a relational database we need to be able to understand and use the terminology associated with a relational database.
A relational database data structure can look similar to a file-based structure as it also consists of records and fields. A table is a group of similar data, in a database, with rows for each instance of an entity and columns for each attribute. A record is a row in a table in a database. A field is a column in a table in a database.
For example, a database of students in a school could contain the table Student with a record for each student that contains the fields First Name, Second Name, Date of Birth and Class ID.
First Name | Second Name | Date Of Birth | Class ID
Noor | Baig | 09/22/2010 | 7A
Ahmed | Sayed | 06/11/2010 | 7B
Tahir | Hassan | 01/30/2011 | 7A
← row is a record
↑ column is a field
▲ Table 8.1 Part of a student table
Now data is independent of the program processing it. The terms record and field are also used in file processing, so there is more rigorous terminology used specifically for relational databases. Files of data are replaced by tables, with each row of a table representing a record (a tuple, sometimes called a logical record or an occurrence of an entity). Each column of the table is an attribute that can also be referred to as a field.
An entity is anything that can have data stored about it, such as a person, place, event or object. An attribute is an individual data item stored for an entity; to use the same example as before, for a student attributes could include first name, second name, date of birth and class. As stated before, a table is a group of similar data, in a database, with rows for each instance of an entity and columns for each attribute. A tuple is one instance of an entity, which is represented by a row in a table.
First Name | Second Name | Date Of Birth | Class ID
Noor | Baig | 09/22/2010 | 7A
Ahmed | Sayed | 06/11/2010 | 7B
Tahir | Hassan | 01/30/2011 | 7A
← each row is a tuple
↑ each column is an attribute
▲ Table 8.2 Part of a table for student entity
Data is shared between applications using the database. In order to ensure the consistency of data, updating is controlled or automatic, so that any copies of a data item are changed to the new value. Also, in order to reduce the number of copies of a data item to a minimum, a relational database uses pointers between tables. These pointers are keys that provide relationships between tables.
There are different types of keys.
» A candidate key is an attribute or smallest set of attributes in a table where no tuple has the same value.
» A primary key is a unique identifier for a table; it is a special case of a candidate key.
» A secondary key is a candidate key that is an alternative to the primary key.
» A foreign key is a set of attributes in one table that refer to the primary key in another table.
For example, a database of chemical elements contains a table Elements with attributes Symbol, Name and Atomic Weight. As all these attributes are unique to each element, all are candidate keys. One of these could be chosen as the primary key, for example Symbol. Then the other two attributes, Name and Atomic Weight, would be secondary keys.
all attributes are candidate keys ↓
Symbol | Name | Atomic Weight
H | Hydrogen | 1.008
Li | Lithium | 6.94
Na | Sodium | 22.990
↑ Symbol is the primary key
↑ Name and Atomic Weight are secondary keys
▲ Table 8.3 Part of a table of elements
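As a rough sketch, the idea of a candidate key can be tested against sample data in Python. The function name single_attribute_candidate_keys is hypothetical, and the check only considers single attributes (not sets of attributes). It can only show that an attribute is unique in the rows supplied; whether it is unique for every possible row is a design decision.

```python
# Rows taken from Table 8.3 and from the student table (Table 8.1).
elements = [
    {"Symbol": "H", "Name": "Hydrogen", "AtomicWeight": 1.008},
    {"Symbol": "Li", "Name": "Lithium", "AtomicWeight": 6.94},
    {"Symbol": "Na", "Name": "Sodium", "AtomicWeight": 22.990},
]
students = [
    {"FirstName": "Noor", "SecondName": "Baig", "ClassID": "7A"},
    {"FirstName": "Ahmed", "SecondName": "Sayed", "ClassID": "7B"},
    {"FirstName": "Tahir", "SecondName": "Hassan", "ClassID": "7A"},
]


def single_attribute_candidate_keys(rows):
    """Return the attributes whose values are different in every row supplied."""
    keys = []
    for attribute in rows[0]:
        values = [row[attribute] for row in rows]
        if len(set(values)) == len(values):  # no value appears twice
            keys.append(attribute)
    return keys


element_keys = single_attribute_candidate_keys(elements)
student_keys = single_attribute_candidate_keys(students)
print(element_keys)  # ['Symbol', 'Name', 'AtomicWeight']
print(student_keys)  # ['FirstName', 'SecondName']
```

The second result shows the limit of testing on sample data: first names happen to be unique in three rows, but two students could share a name, which is why an attribute such as a student ID is normally introduced.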
Most tables have only one candidate key, which is used as the primary key. For example, the student table could have an extra attribute Student ID, which is unique to each student.
Student ID | First Name | Second Name | Date Of Birth | Class ID
S1276 | Noor | Baig | 09/22/2010 | 7A
S1277 | Ahmed | Sayed | 06/11/2010 | 7B
S2199 | Tahir | Hassan | 01/30/2011 | 7A
↑ Student ID is the primary key and the candidate key
▲ Table 8.4 Part of a table for student entity
Relationships
A relationship is formed when one table in a database has a foreign key that refers to a primary key in another table in the database. In order to ensure referential integrity the database must not contain any values of a foreign key that are not matched to the corresponding primary key.
Most databases include more than one table. For example, a school database could contain the table Student and another table Class that contains the Class ID, the Teacher Name and Location of classroom. Only values for Class ID that are stored in the Class table can be used as the foreign key in the Student table.
Student ID | First Name | Second Name | Date Of Birth | Class ID
S1276 | Noor | Baig | 09/22/2010 | 7A
S1277 | Ahmed | Sayed | 06/11/2010 | 7B
S2199 | Tahir | Hassan | 01/30/2011 | 7A
↑ Class ID is the foreign key
▲ Table 8.5 Part of a table for student entity
Class ID | Teacher Name | Location
7A | Mr Khan | Floor 2 Room 3
7B | Miss Malik | Floor 2 Room 4
7C | Miss Gill | Floor 2 Room 5
↑ Class ID is the primary key
▲ Table 8.6 Part of a table for class entity
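Referential integrity between the Student and Class tables can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: SQLite is used only because it needs no installation, the column types are a guess, and the rejected student S9999 in class 7Z is invented. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other DBMSs behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys unless this is set

conn.execute("""
    CREATE TABLE Class (
        ClassID     TEXT PRIMARY KEY,
        TeacherName TEXT,
        Location    TEXT
    )""")
conn.execute("""
    CREATE TABLE Student (
        StudentID   TEXT PRIMARY KEY,
        FirstName   TEXT,
        SecondName  TEXT,
        DateOfBirth TEXT,
        ClassID     TEXT REFERENCES Class(ClassID)  -- foreign key
    )""")

conn.executemany("INSERT INTO Class VALUES (?, ?, ?)", [
    ("7A", "Mr Khan", "Floor 2 Room 3"),
    ("7B", "Miss Malik", "Floor 2 Room 4"),
    ("7C", "Miss Gill", "Floor 2 Room 5"),
])
conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
             ("S1276", "Noor", "Baig", "09/22/2010", "7A"))

# Class 7Z is not in the Class table, so this row would break referential integrity.
try:
    conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
                 ("S9999", "Test", "Student", "01/01/2011", "7Z"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, student_count)  # True 1
```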
Relationships can take several forms
» one-to-one, 1:1
» one-to-many, 1:m
» many-to-one, m:1
» many-to-many, m:m.
The relationship between Student and Class is many-to-one, as one value of the attribute Class ID may appear many times in the Student table but only once in the Class table. In order to speed up searching for data, an index can be used. This is a data structure built from one or more columns in a database table. The Student table could be indexed on Class, Second Name and First Name to provide class lists in alphabetical order of Second Name.
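The index on Class, Second Name and First Name might look like this in SQLite, run here through Python's sqlite3 module. The index name idx_class_list and the column types are assumptions made for the sketch. With only three rows the index makes no measurable difference; the point is the syntax and the ordering of the class list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        StudentID   TEXT PRIMARY KEY,
        FirstName   TEXT,
        SecondName  TEXT,
        DateOfBirth TEXT,
        ClassID     TEXT
    )""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", [
    ("S1276", "Noor", "Baig", "09/22/2010", "7A"),
    ("S1277", "Ahmed", "Sayed", "06/11/2010", "7B"),
    ("S2199", "Tahir", "Hassan", "01/30/2011", "7A"),
])

# An index built from three columns, in the order the class lists need.
conn.execute("CREATE INDEX idx_class_list ON Student (ClassID, SecondName, FirstName)")

class_list = conn.execute("""
    SELECT ClassID, SecondName, FirstName
    FROM Student
    ORDER BY ClassID, SecondName, FirstName
""").fetchall()

for row in class_list:
    print(row)
# ('7A', 'Baig', 'Noor')
# ('7A', 'Hassan', 'Tahir')
# ('7B', 'Sayed', 'Ahmed')
```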
8.1.4 Entity-relationship (E-R) diagrams
An E-R diagram can be used to document the design of a database. This provides an easily understandable visual representation of how the entities in a database are related.
[Figure: two boxes, each showing an entity name and its attributes. Student: Student ID, First Name, Second Name, Date of Birth, Class ID. Class: Class ID, Teacher Name, Location. The line joining the boxes is labelled ‘one class has many students’.]
▲ Figure 8.3 E-R diagram for school database
Relationships may be mandatory or optional. For example, in a workroom with desks, each employee has one desk, but there could be spare desks. The relationship between desk and employee is zero or one, so this relationship is optional. The relationship between mother and child is mandatory because every mother must have at least one child, so the relationship is one or many. The type of relationship and whether it is mandatory or optional gives the cardinality of the relationship. The cardinality of relationships is shown in Figure 8.4.
[Figure: the line-end symbols used for each cardinality: one; many; one (and only one); zero or one; one or many; zero or many.]
▲ Figure 8.4 Cardinality of relationships
ACTIVITY 8B
A teacher can have more than one class. A table Teacher is to be added. The School database will also include the following details about each teacher:
» teaching licence number
» date of birth
» address.
List the attributes for this new table. Show the change that should be made to the attributes in the Class table. Draw the new E-R diagram for the three tables in the database.
EXTENSION ACTIVITY 8A
In small groups, identify suitable entity relationships for each example of cardinality shown above. Explain your findings to another group or the whole class.
8.1.5 The normalisation process
Normalisation is used to construct a relational database that has integrity and in which data redundancy is reduced. Tables that are not normalised will be larger. As more data is stored, it will be harder to update the database when changes are made and more difficult to extract the required data to answer queries. For example, if the School database is held in a single table it could look like this:
Student ID | First Name | Second Name | Date Of Birth | Class ID | Location | Teacher Name | Licence Number | Address | Teacher Date Of Birth
S1276 | Noor | Baig | 09/22/2010 | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985
S1277 | Ahmad | Sayed | 06/11/2010 | 7B | Floor 2 Room 4 | Miss Malik | 68943 | School House 2 | 12/14/1988
S1299 | Tahir | Hassan | 01/30/2011 | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985
▲ Table 8.7
This could cause problems when alterations are made to the records. Every time a new student is added, the teacher’s name, address, licence number, date of birth, and the location of the classroom need to be added as well. If Mr Khan leaves the school and is replaced by another teacher, then every record containing his name and other details needs to be changed. If all the students from Class 7B leave, then all the details about Class 7B will be lost.
The rules for normalisation are set out as follows.
1 First normal form (1NF) – entities do not contain repeated groups of attributes.
2 Second normal form (2NF) – entities are in 1NF and any non-key attributes depend upon the primary key. There are no partial dependencies.
3 Third normal form (3NF) – entities are in 2NF and all non-key attributes are independent. The table contains no non-key dependencies.
When the database is in 3NF, all attributes in a table depend upon the key, the whole key and nothing but the key.
The School database also includes subject choices for each student. For this database to be normalised, the process is:
Student ID | First Name | Second Name | Date Of Birth | Subject Name | Subject Teacher | Class ID | Location | Teacher Name | Licence Number | Address | Teacher Date Of Birth
S1276 | Noor | Baig | 09/22/2010 | Maths, History, Geography | Mr Yee, Miss Wu, Mr Khan | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985
S1277 | Ahmad | Sayed | 06/11/2010 | Maths, Science, Geography | Mr Yee, Miss Yo, Mr Khan | 7B | Floor 2 Room 4 | Miss Malik | 68943 | School House 2 | 12/14/1988
S1299 | Tahir | Hassan | 01/30/2011 | Maths, Science, History | Mr Yee, Miss Yo, Miss Wu | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985
▲ Table 8.8
First normal form (1NF)
The un-normalised School database can be represented as follows.
STUDENT(StudentID, FirstName, SecondName, DateOfBirth, SubjectName, SubjectTeacher, SubjectName, SubjectTeacher, SubjectName, SubjectTeacher, ClassID, Location, TeacherName, LicenceNumber, Address, TeacherDateOfBirth)
STUDENT is the table name; the attributes are listed in order and the primary key is StudentID. The student’s subjects and the subject teacher are the repeating attributes. For the database to be in first normal form, these need to be removed to a separate table and linked to the original table with a foreign key.
Student ID | First Name | Second Name | Date Of Birth | Class ID | Location | Teacher Name | Licence Number | Address | Teacher Date Of Birth
S1276 | Noor | Baig | 09/22/2010 | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985
S1277 | Ahmad | Sayed | 06/11/2010 | 7B | Floor 2 Room 4 | Miss Malik | 68943 | School House 2 | 12/14/1988
S1299 | Tahir | Hassan | 01/30/2011 | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985

Student ID | Subject Name | Subject Teacher
S1276 | Maths | Mr Yee
S1276 | History | Miss Wu
S1276 | Geography | Mr Khan
S1277 | Maths | Mr Yee
S1277 | Science | Miss Yo
S1277 | Geography | Mr Khan
S1299 | Maths | Mr Yee
S1299 | Science | Miss Yo
S1299 | History | Miss Wu
▲ Table 8.9 School database in 1NF
The School database can now be represented in 1NF as follows.
STUDENT(StudentID, FirstName, SecondName, DateOfBirth, ClassID, Location, TeacherName, LicenceNumber, Address, TeacherDateOfBirth)
STUDENTSUBJECT(StudentID, SubjectName, SubjectTeacher)
The primary key for the STUDENTSUBJECT table is a composite key formed from the two attributes StudentID and SubjectName; the attribute StudentID is also a foreign key that links to the STUDENT table.
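The step to first normal form can be sketched in Python. The data comes from Table 8.8; the function name to_student_subject is hypothetical, and the sketch assumes the subjects and teachers in each un-normalised row are comma-separated lists in matching order.

```python
# Repeating groups as they appear in the un-normalised table (Table 8.8).
unnormalised = [
    {"StudentID": "S1276", "SubjectName": "Maths, History, Geography",
     "SubjectTeacher": "Mr Yee, Miss Wu, Mr Khan"},
    {"StudentID": "S1277", "SubjectName": "Maths, Science, Geography",
     "SubjectTeacher": "Mr Yee, Miss Yo, Mr Khan"},
    {"StudentID": "S1299", "SubjectName": "Maths, Science, History",
     "SubjectTeacher": "Mr Yee, Miss Yo, Miss Wu"},
]


def to_student_subject(rows):
    """Split each repeating group into one (StudentID, SubjectName, SubjectTeacher) row."""
    result = []
    for row in rows:
        subjects = [s.strip() for s in row["SubjectName"].split(",")]
        teachers = [t.strip() for t in row["SubjectTeacher"].split(",")]
        for subject, teacher in zip(subjects, teachers):
            result.append((row["StudentID"], subject, teacher))
    return result


student_subject = to_student_subject(unnormalised)

# The composite key (StudentID, SubjectName) must identify each row uniquely.
composite_keys = {(student_id, subject) for student_id, subject, _ in student_subject}
print(len(student_subject), len(composite_keys))  # 9 9
```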
Second normal form (2NF)
There are now two tables; in the STUDENTSUBJECT table the primary key is a composite key and the SubjectTeacher is only dependent on the SubjectName part of the primary key. This is a partial dependence and needs to be removed by introducing a third table, SUBJECT.
Student ID | First Name | Second Name | Date Of Birth | Class ID | Location | Teacher Name | Licence Number | Address | Teacher Date Of Birth
S1276 | Noor | Baig | 09/22/2010 | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985
S1277 | Ahmad | Sayed | 06/11/2010 | 7B | Floor 2 Room 4 | Miss Malik | 68943 | School House 2 | 12/14/1988
S1299 | Tahir | Hassan | 01/30/2011 | 7A | Floor 2 Room 3 | Mr Khan | 37952 | School House 1 | 03/27/1985

Student ID | Subject Name
S1276 | Maths
S1276 | History
S1276 | Geography
S1277 | Maths
S1277 | Science
S1277 | Geography
S1299 | Maths
S1299 | Science
S1299 | History

Subject Name | Subject Teacher
Maths | Mr Yee
History | Miss Wu
Geography | Mr Khan
Science | Miss Yo
▲ Table 8.10 School database in 2NF
The School database can now be represented in 2NF as follows.
STUDENT(StudentID, FirstName, SecondName, DateOfBirth, ClassID, Location, TeacherName, LicenceNumber, Address, TeacherDateOfBirth)
STUDENTSUBJECT(StudentID, SubjectName)
SUBJECT(SubjectName, SubjectTeacher)
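Removing the partial dependency can be sketched in Python as follows. The rows are those of the 1NF STUDENTSUBJECT table; the variable names are invented for the example. The sketch also checks the assumption the split relies on, namely that each subject name maps to exactly one teacher.

```python
# STUDENTSUBJECT in 1NF: (StudentID, SubjectName, SubjectTeacher), from Table 8.9.
student_subject_1nf = [
    ("S1276", "Maths", "Mr Yee"), ("S1276", "History", "Miss Wu"),
    ("S1276", "Geography", "Mr Khan"), ("S1277", "Maths", "Mr Yee"),
    ("S1277", "Science", "Miss Yo"), ("S1277", "Geography", "Mr Khan"),
    ("S1299", "Maths", "Mr Yee"), ("S1299", "Science", "Miss Yo"),
    ("S1299", "History", "Miss Wu"),
]

# SubjectTeacher depends only on SubjectName, so it moves to its own table.
subject = {}
for _, subject_name, teacher in student_subject_1nf:
    # setdefault stores the teacher the first time a subject is seen
    # and returns the stored teacher on later rows.
    stored = subject.setdefault(subject_name, teacher)
    if stored != teacher:
        raise ValueError(f"{subject_name} has more than one teacher")

# STUDENTSUBJECT keeps only the composite key.
student_subject_2nf = [(student_id, name) for student_id, name, _ in student_subject_1nf]

print(subject)
# {'Maths': 'Mr Yee', 'History': 'Miss Wu', 'Geography': 'Mr Khan', 'Science': 'Miss Yo'}
print(len(student_subject_2nf))  # 9
```

Each teacher's name is now stored once per subject rather than once per student taking that subject.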
Third normal form (3NF)
There are now three tables. In the STUDENT table, the attributes Location and TeacherName depend upon the attribute ClassID and the attributes LicenceNumber, Address and TeacherDateOfBirth depend upon the attribute TeacherName. These are non-key dependencies that need to be removed to ensure that the database is in 3NF.
At this stage it is also worth inspecting the database and its contents to consider any other problems that could arise, such as the following:
» Teacher names might not be unique; therefore, it is better to use the licence number as a primary key.
» Teachers can be both class teachers and subject teachers; these need to be combined in one table.
Student ID | First Name | Second Name | Date Of Birth | Class ID
S1276 | Noor | Baig | 09/22/2010 | 7A
S1277 | Ahmad | Sayed | 06/11/2010 | 7B
S1299 | Tahir | Hassan | 01/30/2011 | 7A

Licence Number | Teacher Name | Address | Teacher Date Of Birth
37952 | Mr Khan | School House 1 | 03/27/1985
68943 | Miss Malik | School House 2 | 12/14/1988
35859 | Mr Yee | School House 1 | 10/07/1985
77248 | Miss Yo | School House 2 | 05/05/1987
72691 | Miss Wu | School House 2 | 11/21/1989

Class ID | Location | Licence Number
7A | Floor 2 Room 3 | 37952
7B | Floor 2 Room 4 | 68943

Student ID | Subject Name
S1276 | Maths
S1276 | History
S1276 | Geography
S1277 | Maths
S1277 | Science
S1277 | Geography
S1299 | Maths
S1299 | Science
S1299 | History

Subject Name | Licence Number
Maths | 35859
History | 72691
Geography | 37952
Science | 77248
▲ Table 8.11 School database in 3NF
The improved School database can now be represented in 3NF as follows.
STUDENT(StudentID, FirstName, SecondName, DateOfBirth, ClassID)
CLASS(ClassID, Location, LicenceNumber)
TEACHER(LicenceNumber, TeacherName, Address, TeacherDateOfBirth)
STUDENTSUBJECT(StudentID, SubjectName)
SUBJECT(SubjectName, LicenceNumber)
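One possible way to build this 3NF structure is sketched below using SQLite through Python's sqlite3 module. The column types are assumptions, STUDENT is given the ClassID attribute shown in Table 8.11, and only three of the five tables are populated to keep the example short. The replacement name ‘Ms Shah’ is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
CREATE TABLE Teacher (
    LicenceNumber INTEGER PRIMARY KEY, TeacherName TEXT,
    Address TEXT, TeacherDateOfBirth TEXT);
CREATE TABLE Class (
    ClassID TEXT PRIMARY KEY, Location TEXT,
    LicenceNumber INTEGER REFERENCES Teacher(LicenceNumber));
CREATE TABLE Student (
    StudentID TEXT PRIMARY KEY, FirstName TEXT, SecondName TEXT,
    DateOfBirth TEXT, ClassID TEXT REFERENCES Class(ClassID));
CREATE TABLE Subject (
    SubjectName TEXT PRIMARY KEY,
    LicenceNumber INTEGER REFERENCES Teacher(LicenceNumber));
CREATE TABLE StudentSubject (
    StudentID TEXT REFERENCES Student(StudentID),
    SubjectName TEXT REFERENCES Subject(SubjectName),
    PRIMARY KEY (StudentID, SubjectName));  -- composite key
""")

conn.executemany("INSERT INTO Teacher VALUES (?, ?, ?, ?)", [
    (37952, "Mr Khan", "School House 1", "03/27/1985"),
    (68943, "Miss Malik", "School House 2", "12/14/1988"),
])
conn.executemany("INSERT INTO Class VALUES (?, ?, ?)", [
    ("7A", "Floor 2 Room 3", 37952),
    ("7B", "Floor 2 Room 4", 68943),
])
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", [
    ("S1276", "Noor", "Baig", "09/22/2010", "7A"),
    ("S1277", "Ahmad", "Sayed", "06/11/2010", "7B"),
    ("S1299", "Tahir", "Hassan", "01/30/2011", "7A"),
])

CLASS_TEACHER_QUERY = """
    SELECT s.FirstName, c.Location, t.TeacherName
    FROM Student s
    JOIN Class c ON s.ClassID = c.ClassID
    JOIN Teacher t ON c.LicenceNumber = t.LicenceNumber
    ORDER BY s.StudentID
"""
before = conn.execute(CLASS_TEACHER_QUERY).fetchall()

# The teacher's name is stored once, so one UPDATE changes it for every student.
conn.execute("UPDATE Teacher SET TeacherName = 'Ms Shah' WHERE LicenceNumber = 37952")
after = conn.execute(CLASS_TEACHER_QUERY).fetchall()

print(before[0])  # ('Noor', 'Floor 2 Room 3', 'Mr Khan')
print(after[0])   # ('Noor', 'Floor 2 Room 3', 'Ms Shah')
```

In the single-table version of the database, the same change would have needed an update to every student record in Class 7A.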
ACTIVITY 8C
Construct an E-R diagram to represent the database structure of the fully normalised school database shown above.
EXTENSION ACTIVITY 8B
Discuss any other possible problems that could occur with this database. Hint: look at the subject table and think about subjects that could have more than one teacher or different levels. Identify an improved database structure that could solve the problem.
The School database example showed at each stage why the database was not normalised. Here is another example for you to try. A database has been set up as a single table to store employees of a business and their contacts. Part of the database is shown below. Employee Number
Employee Name
Position
Contact Number Contact Name
Contact Email Address
7001
James Tey
Financial Director
28
Mary Jones
[emailprotected]
31
James Smith
[emailprotected]
17
Mishal Hussani
[emailprotected]
19
Mary Cheung
[emailprotected]
27
Dean Knight
[emailprotected]
28
Mary Jones
[emailprotected]
7002
Paul Leigh
Accountant
7011
Suzy Mey
Personnel Manager
▲ Table 8.12 Un-normalised employee database
This table is not in 1NF because there are repeating attributes and the table is not in 3NF because there are non-key dependencies. The employee database can be represented as:
EMPLOYEE(EmployeeNumber, EmployeeName, Position, ContactNumber, ContactName, ContactEmailAddress)
where EmployeeNumber is the primary key, and ContactNumber, ContactName and ContactEmailAddress may be repeated as often as required.
ACTIVITY 8D
Normalise the Employee database and show the new tables. Draw the E-R diagram for the normalised database.
ACTIVITY 8E
1 a) i) Describe the limitations of a file-based approach to storage and retrieval of data.
ii) Give two benefits of using a database management system.
b) A new relational database is to be developed. The developer needs to produce a normalised database design.
i) Explain what is meant by normalisation.
ii) Describe the process of normalisation.
2 A warehouse stores parts for cars for several manufacturers. A database stores the following data for each part:
Part number, part description, date last ordered, minimum order level, manufacturer name, manufacturer address, manufacturer contact details, position in warehouse, number in stock
a) Design a fully normalised database for the parts.
b) Draw the E-R diagram.
8.2 Database management systems (DBMSs)
WHAT YOU SHOULD ALREADY KNOW
Try these two questions before you read the second part of this chapter.
1 a) Name a database management system (DBMS) you have used.
b) Describe three tasks that you have used the DBMS for.
2 Most DBMSs include back-up procedures and access rights to keep the data secure.
a) Describe what is meant by back-up.
b) Describe what is meant by access rights.
c) How do these features help to keep data secure?
Key terms
Database management system (DBMS) – systems software for the definition, creation and manipulation of a database.
Data management – the organisation and maintenance of data in a database to provide the information required.
Data dictionary – a set of data that contains metadata (data about other data) for a database.
Data modelling – the analysis and definition of the data structures required in a database and to produce a data model.
Logical schema – a data model for a specific database that is independent of the DBMS used to build that database.
Access rights (database) – the permissions given to database users to access, modify or delete data.
Developer interface – feature of a DBMS that provides developers with the commands required for definition, creation and manipulation of a database.
Structured query language (SQL) – the standard query language used with relational databases for data definition and data modification.
Query processor – feature of a DBMS that processes and executes queries written in structured query language (SQL).
8.2.1 How a DBMS addresses the limitations of a file-based approach
Data redundancy issue
This is solved by storing data in separate linked tables, which reduces the duplication of data as most items of data are only stored once. Items of data used to link tables by the use of foreign keys are stored more than once. The DBMS will flag any possible errors when any attempt is made to accidentally delete this type of item.
Data inconsistency issue
This is also solved by storing most items of data only once, allowing updated items to be seen by all applications. As data is not inconsistent, the integrity of the data stored is improved. Consistent data is easier to maintain as an item of data will only be changed once, not multiple times, by different applications.
Data dependency issue
Data is independent of the applications using the database, so changes made to the structure of the data will be managed by the DBMS and have little or no effect on the applications using the database. Any fields or tables added to or removed from the database will not affect the applications that do not use those fields/tables, as each application only has access to the fields/tables it requires. Information from a database is more easily available in a form that is required so it is not dependent on the structure of the data and the application used. A DBMS usually includes facilities to query the data stored using a defined query language or a query-by-example facility.
The DBMS approach
A DBMS uses a more structured approach to the management, organisation and maintenance of data in a database. An already-defined data structure can be used to set up and create the database. The entry of new data, the storage of data, the alteration and deletion of data are all managed by the DBMS.
A DBMS uses a data dictionary to store the metadata, including the definition of tables, attributes, relationships between tables and any indexing. The data dictionary can also define the validation rules used for the entry of data and contain data about the physical storage of the data. The use of a data dictionary improves the integrity of the data stored, helping to ensure that it is accurate, complete and consistent.
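As an illustration of metadata held by a DBMS, the sketch below reads SQLite's own catalogue through Python's sqlite3 module. SQLite is an assumption made for convenience: its sqlite_master table and PRAGMA table_info command play a role similar to a data dictionary, but other DBMSs expose this information differently. The index name idx_teacher is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Class (ClassID TEXT PRIMARY KEY, TeacherName TEXT, Location TEXT)")
conn.execute("CREATE INDEX idx_teacher ON Class (TeacherName)")

# sqlite_master lists the tables and indexes the database contains.
# Names starting with 'sqlite_' are internal objects and are filtered out.
objects = conn.execute("""
    SELECT type, name FROM sqlite_master
    WHERE name NOT LIKE 'sqlite_%'
    ORDER BY name
""").fetchall()

# PRAGMA table_info returns one row per column; the name is the second field.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(Class)")]

print(objects)       # [('table', 'Class'), ('index', 'idx_teacher')]
print(column_names)  # ['ClassID', 'TeacherName', 'Location']
```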
Data modelling is an important tool used to show the data structure of a database. An E-R diagram is an example of a data model. A logical schema is a data model for a specific database that is independent of the DBMS used to build the database.
A DBMS helps to provide data security to prevent the unwanted alteration, corruption, deletion or sharing of data with others that have no right to access it. Security measures taken by a DBMS can include
» using usernames and passwords to prevent unauthorised access to the database
» using access rights to manage the actions authorised users can take, for example, users could read/write/delete, or read only, or append only
» using access rights to manage the parts of the database they have access to, for example, the provisions of different views of the data for different users to allow only certain users access to some tables
» automatic creation and scheduling of regular back-ups
» encryption of the data stored
» automatic creation of an audit trail or activity log to record the actions taken by users of the database.
8.2.2 The use and purpose of DBMS software tools Developer interface The developer interface allows a developer to write queries in structured query language (SQL) rather than using query-by-example. These queries are then processed and executed by the query processor. This allows the construction of more complex queries to interrogate the database. Query processor The query processor takes a query written in SQL and processes it. The query processor includes a DDL interpreter, a DML compiler and a query evaluation engine. Any DDL statements are interpreted and recorded in the database’s data dictionary. DML statements are compiled into low level instructions that are executed by the query evaluation engine. The DML compiler will also optimise the query.
ACTIVITY 8F 1 a) Describe how a DBMS overcomes the limitations of a file-based approach to the storage and retrieval of data. b) Describe how a DBMS ensures that data stored in a database is secure. 2 a) Describe three features provided by a DBMS. b) A school stores timetabling data for all pupils and classes. Which features could a DBMS use to ensure that the administrators, teachers and pupils can only see the information available to them?
8.3 Data definition language (DDL) and data manipulation language (DML)
WHAT YOU SHOULD ALREADY KNOW
Try this exercise before you read the third part of this chapter.
Using a DBMS with a graphical user interface (GUI), create the student database used in Section 8.1.5.
Write the following queries using a query-by-example form.
1 A list of all the teachers and their subjects.
2 A list of the pupils in class 7A in alphabetical order of second name.
3 A list of the students studying each subject.
You may want to save this database to practise your SQL commands.

Key terms
Data definition language (DDL) – a language used to create, modify and remove the data structures that form a database.
Data manipulation language (DML) – a language used to add, modify, delete and retrieve the data stored in a relational database.
SQL script – a list of SQL commands that perform a given task, often stored in a file for reuse.

8.3.1 Industry standard methods for building and modifying a database
DBMSs use a data definition language (DDL) to create, modify and remove the data structures that form a relational database. DDL statements are written as a script that uses syntax similar to a computer program.
DBMSs use a data manipulation language (DML) to add, modify, delete and retrieve the data stored in a relational database. DML statements are written in a script that is similar to a computer program. These languages have different functions: DDL is used for working on the relational database structure, whereas DML is used to work with the data stored in the relational database. Most DBMSs use structured query language (SQL) for both data definition and data manipulation. SQL was developed in the 1970s and since then it has been adopted as an industry standard.
8.3.2 SQL (DDL) commands and scripts
In order to be able to understand and write SQL, you should have practical experience of writing SQL scripts. There are many applications that allow you to do this. For example, MySQL and SQLite are freely available ones. When using any SQL application it is important that you check the commands available to use as these may differ slightly from those listed below.
You will need to be able to understand and use the following DDL commands.

SQL (DDL) command             Description
CREATE DATABASE               Creates a database
CREATE TABLE                  Creates a table definition
ALTER TABLE                   Changes the definition of a table
PRIMARY KEY                   Adds a primary key to a table
FOREIGN KEY … REFERENCES …    Adds a foreign key to a table
▲ Table 8.13 DDL commands
You also need to be familiar with the following data types used for attributes in SQL.

Data types for attributes     Description
CHARACTER                     Fixed length text
VARCHAR(n)                    Variable length text
BOOLEAN                       True or False; SQL uses the integers 1 and 0
INTEGER                       Whole number
REAL                          Number with decimal places
DATE                          A date usually formatted as YYYY-MM-DD
TIME                          A time usually formatted as HH:MM:SS
▲ Table 8.14 Data types for attributes
Here are some examples of DDL that could have been used when the school database was created.

CREATE DATABASE School;
-- the database is created first

CREATE TABLE Student(
    StudentID CHARACTER,
    FirstName CHARACTER,
    SecondName CHARACTER,
    DateOfBirth DATE,
    ClassID CHARACTER);
-- then the table, followed by the attributes

ALTER TABLE Student ADD PRIMARY KEY (StudentID);
-- the primary key is added after the table is created; this can also be done during table creation

CREATE TABLE Class(
    ClassID CHARACTER,
    Location CHARACTER,
    LicenceNumber CHARACTER);

ALTER TABLE Class ADD PRIMARY KEY (ClassID);

ALTER TABLE Student ADD FOREIGN KEY (ClassID) REFERENCES Class(ClassID);
-- the foreign key is added after the Class table is created
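The DDL above can be tried out from Python using the built-in sqlite3 module. This is only a sketch under some assumptions: SQLite has no CREATE DATABASE command (an in-memory connection stands in for the School database) and it cannot add keys with ALTER TABLE, so the keys are declared during table creation instead. The helper name create_school_tables is hypothetical.

```python
import sqlite3


def create_school_tables(conn):
    """Create the Class and Student tables from the DDL example."""
    # Class is created first so that Student can reference it.
    conn.execute(
        """CREATE TABLE Class(
               ClassID CHARACTER PRIMARY KEY,
               Location CHARACTER,
               LicenceNumber CHARACTER)"""
    )
    conn.execute(
        """CREATE TABLE Student(
               StudentID CHARACTER PRIMARY KEY,
               FirstName CHARACTER,
               SecondName CHARACTER,
               DateOfBirth DATE,
               ClassID CHARACTER,
               FOREIGN KEY (ClassID) REFERENCES Class(ClassID))"""
    )


# ':memory:' gives a temporary database that disappears when the program ends.
conn = sqlite3.connect(":memory:")
create_school_tables(conn)

# List the tables that now exist in the database.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```

Other DBMSs, such as MySQL, accept the ALTER TABLE form shown in the text, which is why it is worth checking the commands available in the application you use.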
ACTIVITY 8G
Create the Teacher table and add the Licence Number as a foreign key to the Class table.
8.3.3 SQL (DML) commands and scripts
In order to be able to understand and write SQL, you should have practical experience of writing SQL scripts and queries. There are many applications that allow you to do this. Again, MySQL and SQLite are freely available ones. You can also write SQL commands in Access. When using any SQL application, it is important that you check the commands available to use as these may differ slightly from those listed below.

You will need to be able to understand and use the following DML commands.

SQL (DML) query command       Description
SELECT FROM                   Fetches data from a database. Queries always begin with SELECT.
WHERE                         Includes only rows in a query that match a given condition
ORDER BY                      Sorts the results from a query by a given column either alphabetically or numerically
GROUP BY                      Arranges data into groups
INNER JOIN                    Combines rows from different tables if the join condition is true
SUM                           Returns the sum of all the values in the column
COUNT                         Counts the number of rows where the column is not NULL
AVG                           Returns the average value for a column with a numeric data type

SQL (DML) maintenance commands   Description
INSERT INTO                   Adds new row(s) to a table
DELETE FROM                   Removes row(s) from a table
UPDATE                        Edits row(s) in a table
▲ Table 8.15 DML commands
Here are some examples of DML that could have been used to query and update the school database.

This query will show, in alphabetical order of second name, the first and second names of all students in class 7A:

SELECT FirstName, SecondName
FROM Student
WHERE ClassID = '7A'
ORDER BY SecondName

This query will show the teacher’s name and the subject taught:

SELECT Teacher.TeacherName, Subject.SubjectName
FROM Teacher INNER JOIN Subject
ON Teacher.LicenceNumber = Subject.LicenceNumber
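The two queries above can be run against a small test database using Python's built-in sqlite3 module. The sketch below is not the book's school database: the rows and the cut-down table layouts are invented for illustration, and an ORDER BY has been added to the join query only so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, just enough to exercise both queries.
conn.executescript(
    """
    CREATE TABLE Student(StudentID, FirstName, SecondName, ClassID);
    CREATE TABLE Teacher(LicenceNumber, TeacherName);
    CREATE TABLE Subject(SubjectName, LicenceNumber);
    INSERT INTO Student VALUES ('S1', 'Amy', 'Zhang', '7A');
    INSERT INTO Student VALUES ('S2', 'Ben', 'Khan', '7A');
    INSERT INTO Student VALUES ('S3', 'Cal', 'Moore', '7B');
    INSERT INTO Teacher VALUES ('L1', 'Ms Ali');
    INSERT INTO Teacher VALUES ('L2', 'Mr Stone');
    INSERT INTO Subject VALUES ('Maths', 'L1');
    INSERT INTO Subject VALUES ('Science', 'L2');
    """
)

# Students in class 7A, sorted by second name.
class_7a = conn.execute(
    """SELECT FirstName, SecondName FROM Student
       WHERE ClassID = '7A' ORDER BY SecondName"""
).fetchall()

# Each teacher paired with the subject that shares their licence number.
teacher_subjects = conn.execute(
    """SELECT Teacher.TeacherName, Subject.SubjectName
       FROM Teacher INNER JOIN Subject
       ON Teacher.LicenceNumber = Subject.LicenceNumber
       ORDER BY Teacher.TeacherName"""
).fetchall()

print(class_7a)
print(teacher_subjects)
```

Note that the column lists in SELECT are separated by commas; the keyword AND is only used to combine conditions.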
ACTIVITY 8H
Create a query to show each student’s First Name, Second Name and the subjects studied by each student.
This statement will insert a row into the Student table:

INSERT INTO Student VALUES('S1301', 'Peter', 'Probert', '2011-06-06', '7A')

If the values for all the columns are not known, then the table columns need to be specified before the values are inserted:

INSERT INTO Student(StudentID, FirstName, SecondName) VALUES('S1301', 'Peter', 'Probert')

This statement will delete the specified row(s) from the Student table (take care: DELETE FROM Student without a WHERE clause will delete every row in the table!):

DELETE FROM Student WHERE StudentID = 'S1301'

The values for any column can be counted, totalled or averaged. For example, if an extra column was added to the STUDENTSUBJECT table showing each student’s exam mark in that subject, the following query could be used to total all of the students’ exam marks:

SELECT SUM(ExamMark) FROM STUDENTSUBJECT
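The maintenance commands above can be checked in the same way with Python's sqlite3 module. This is a minimal sketch: the second student, the subject names and the exam marks are made-up values, and the STUDENTSUBJECT layout is an assumption based on the description in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student(StudentID, FirstName, SecondName, DateOfBirth, ClassID)"
)
conn.execute("CREATE TABLE STUDENTSUBJECT(StudentID, SubjectName, ExamMark INTEGER)")

# Values for every column, in the order the columns were defined.
# Text and date values are written inside quotation marks.
conn.execute(
    "INSERT INTO Student VALUES('S1301', 'Peter', 'Probert', '2011-06-06', '7A')"
)
# Only some values known: name the columns; the others are stored as NULL.
conn.execute(
    "INSERT INTO Student(StudentID, FirstName, SecondName) "
    "VALUES('S1302', 'Hypothetical', 'Pupil')"
)

# The WHERE clause limits the deletion to one row.
conn.execute("DELETE FROM Student WHERE StudentID = 'S1301'")
remaining = conn.execute("SELECT StudentID, ClassID FROM Student").fetchall()

# Made-up exam marks, inserted using ? placeholders for the values.
conn.executemany(
    "INSERT INTO STUDENTSUBJECT VALUES(?, ?, ?)",
    [("S1302", "Maths", 62), ("S1302", "Science", 71)],
)
total_marks = conn.execute("SELECT SUM(ExamMark) FROM STUDENTSUBJECT").fetchone()[0]

print(remaining)
print(total_marks)
```

Python shows the missing ClassID as None, which is how sqlite3 reports a NULL value.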
ACTIVITY 8I Use the SQL statements AVG and COUNT to find the average mark and count how many marks have been recorded.
End of chapter questions
1 A database has been designed to store data about programmers and the programs they have developed. These facts help to define the structure of the database:
– Each programmer works in a particular team.
– Each programmer has a unique first name.
– Each team has one or more programmer.
– Each program is for one customer only.
– Each programmer can work on any program.
– The number of days that each programmer has worked on a program is recorded.
The table ProgDev was the first attempt at designing the database.
FirstName   Team   ProgramName      NoOfDays   Customer
Alice       WC     TV control       3          SKM
                   Ice alert        2          WZP
                   Digital camera   6          HNC
Charles     PC     Oil flow         1          GEB
                   Rescue Pack      8          BGF
Ahmad       QR     TV control       2          SKM
                   Accounts         8          ARC
                   Digital camera   4          HNC
                   Test Pack        3          GKN

a) State why the table is not in first normal form (1NF). [1]
b) The database design is changed to:
Programmer (FirstName, Team)
Program (FirstName, ProgramName, NoOfDays, Customer)
Using the data given in the first attempt table (ProgDev), copy and complete these revised table designs to show how these data are now stored. [3]
Table: Programmer
FirstName   Team
Table: Program
FirstName   ProgramName   NoOfDays   Customer
c) i) A relationship between the two tables has been implemented. Explain how this has been done. [2]
ii) Explain why the Program table is not in third normal form (3NF). [2]
iii) Write the table definitions to give the database in 3NF. [2]
2 A school stores a large amount of data. This includes student attendance, qualification, and contact details. The school’s software uses a file-based approach to store this data.
a) The school is considering changing to a DBMS.
i) State what DBMS stands for. [1]
ii) Describe two ways in which the Database Administrator (DBA) could use the DBMS software to ensure the security of the student data. [4]
iii) A feature of the DBMS software is a query processor. Describe how the school secretary could use this software. [2]
iv) The DBMS has replaced software that used a file-based approach with a relational database. Describe how using a relational database has overcome the previous problems associated with a file-based approach. [3]
b) The database design has three tables to store the classes that students attend.
STUDENT(StudentID, FirstName, LastName, Year, TutorGroup)
CLASS(ClassID, Subject)
CLASS-GROUP(StudentID, ClassID)
Primary keys are not shown. There is a one-to-many relationship between CLASS and CLASS-GROUP.
i) Describe how this relationship is implemented. [2]
ii) Describe the relationship between CLASS-GROUP and STUDENT. [1]
iii) Write an SQL script to display the StudentID and FirstName of all students who are in the tutor group 10B. Display the list in alphabetical order of LastName. [4]
iv) Write an SQL script to display the LastName of all students who attend the class whose ClassID is CS1. [4]
Cambridge International AS & A Level Computer Science 9608 Paper 12 Q8 June 2016
9 Algorithm design and problem solving

In this chapter, you will learn about
★ computational thinking skills (abstraction and decomposition)
★ how to write algorithms that provide solutions to problems using structured English, flowcharts and pseudocode
★ the process of stepwise refinement.

In order to design a computer system that performs a specific task, or solves a given problem, the task or problem has to be rigorously defined and set out, showing what is going to be computed and how it is going to be computed. This chapter introduces tools and techniques that can be used to design a software solution to work with associated computer hardware to form a computer system.
Practice is essential to develop skills in computational thinking. Designs shown with pseudocode or flowcharts can be traced to check if the proposed solution works, but the best way to actually test that a computer system works is to code it and use it or, even better, get somebody else to use it. Therefore, practical programming activities, alongside other activities, will be suggested at every stage to help reinforce the skills being learnt and develop the skill of programming. The programming languages to use are: » Java
» Python
» VB.NET.
WHAT YOU SHOULD ALREADY KNOW Can you answer these six questions and complete the following activity? 1 What is a procedure? 2 What is a function? 3 What is an algorithm? 4 What is structured English? 5 What is a flowchart? 6 What is pseudocode?
Write an algorithm using a flowchart to find the average of a number of integers. Both the number of values and each integer are to be input, and the average is to be output. Use the flowchart of your algorithm to write the algorithm in pseudocode. Use your pseudocode to write and test a program that includes a function to solve the problem.
9.1 Computational thinking skills Key terms Abstraction – the process of extracting information that is essential, while ignoring what is not relevant, for the provision of a solution.
Decomposition – the process of breaking a complex problem into smaller parts.
Pattern recognition – the identification of parts of a problem that are similar and could use the same solution.
Computational thinking is used to study a problem and formulate an effective solution that can be provided using a computer. There are several techniques used in computational thinking, including abstraction, decomposition, algorithms and pattern recognition.
9.1.1 Using abstraction Abstraction is an essential part of computational thinking. It enables computer scientists to develop clear models for the solution to complex problems. Abstraction involves extracting information that is essential while ignoring what is not relevant for the provision of a solution, only including what is necessary to solve that problem. Abstraction encourages the development of simplified models that are suited to a specific purpose by eliminating any unnecessary characteristics from that model. Many everyday items use abstraction, such as maps, calendars and timetables. Maps use abstraction to show what is required for a specific purpose, for example, a road map should only show the necessary detail required to drive from one place to another. The road map in Figure 9.1 has reduced the complexity by only showing the essential details needed, such as roads, road numbers and towns, and removing other information about the terrain that would not be helpful (as shown in the satellite view). The benefits of eliminating any unnecessary characteristics from the model include » the time required to develop the program is reduced so the program can be delivered to the customer more quickly » the program is smaller in size so takes up less space in memory and download times are shortened » customer satisfaction is greater as their requirements are met without any extraneous features.
Map Data © 2018 Google, Imagery © 2018 Landsat/Copernicus
▲ Figure 9.1 Road map and satellite view
The first stage of abstraction is to identify the purpose of the model of the situation that is to be built. The situation could be one that occurs in real life, an imaginary one, or a future event such as modelling the route of a deep space probe. Once the purpose has been identified, sources of information need to be identified. These can include observations, views of potential users, and evidence from other existing models. The next stage is to use the information gathered from appropriate sources to identify what details need to be included in the model, how these details should be presented and what details are extraneous and need to be removed from the model. For example, maps are used for many different purposes and can take different forms depending on the identified use. The purpose of the road map model in Figure 9.1 is to allow a driver to plan a journey; therefore, it includes towns and roads with their numbers. The roads depicted are a scaled down version of the actual road to help the driver visualise the route. A rail map model has another purpose and, therefore, looks very different, only showing rail lines, stations and perhaps details about accessibility for wheelchair users at different stations. A train passenger has no need to visualise the route, so the rail lines are simplified for clarity.
9.1.2 Using decomposition
Decomposition is also an essential part of computational thinking. It enables computer scientists to break a complex problem into smaller parts that can be further subdivided into even smaller parts until each part is easy to examine and understand, and a solution can be developed for it. When a rigorous decomposition is undertaken, many simple problems are found to be more complex than at first sight. Pattern recognition is used to identify those parts that are similar and could use the same solution. This leads to the development of reusable program code in the form of subroutines, procedures and functions. When writing a computer program, each final part is defined as a separate program module that can be written and tested as a separate procedure or function, as shown in Figure 9.2. Program modules already written and tested can also be identified and reused, thus saving development time. See Chapter 12 for further details.

Decomposition
Program
    Module 1
        Module 1.1
        Module 1.2
    Module 2
        Module 2.1
        Module 2.2
            Module 2.2.1
            Module 2.2.2
▲ Figure 9.2 Decomposition of a program into modules
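As an illustration of decomposition, here is a minimal Python sketch in which a small task (reporting a mark as a percentage) is split into modules that can each be written and tested separately. The task and all of the function names are hypothetical; they are not taken from the chapter.

```python
def is_valid_mark(mark, maximum):
    """Module 1: check that a mark lies in the allowed range."""
    return 0 <= mark <= maximum


def as_percentage(mark, maximum):
    """Module 2: convert a mark to a percentage."""
    return (mark / maximum) * 100


def mark_summary(name, mark, maximum):
    """Top-level program: built from the two smaller modules."""
    if not is_valid_mark(mark, maximum):
        return name + ": invalid mark"
    return name + ": " + str(as_percentage(mark, maximum)) + "%"


print(mark_summary("Student A", 40, 80))   # prints: Student A: 50.0%
print(mark_summary("Student B", 90, 80))   # prints: Student B: invalid mark
```

Because each module does one job, is_valid_mark and as_percentage could be reused in another program without change.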
9.2 Algorithms Key terms Structured English – a method of showing the logical steps in an algorithm, using an agreed subset of straightforward English words for commands and mathematical operations. Flowchart – a diagrammatic representation of an algorithm.
Pseudocode – a method of showing the detailed logical steps in an algorithm, using keywords, identifiers with meaningful names, and mathematical operators. Stepwise refinement – the practice of subdividing each part of a larger problem into a series of smaller parts, and so on, as required.
Algorithm – an ordered set of steps to be followed in the completion of a task.
9.2.1 Writing algorithms that provide solutions to problems
There are several methods of writing algorithms before attempting to program a solution. Here are three frequently used methods. » Structured English is a method of showing the logical steps in an algorithm, using an agreed subset of straightforward English words for commands and mathematical operations to represent the solution. These steps can be numbered. » A flowchart shows diagrammatically, using a set of symbols linked together with flow lines, the steps required for a task and the order in which they are to be performed. These steps, together with the order, are called an algorithm. Flowcharts are an effective way to show the structure of an algorithm. » Pseudocode is a method of showing the detailed logical steps in an algorithm, using keywords, identifiers with meaningful names and mathematical operators to represent a solution. Pseudocode does not need to follow the syntax of a specific programming language, but it should provide sufficient detail to allow a program to be written in a high-level language.
ACTIVITY 9A
You have been asked to write an algorithm for drawing regular polygons of any size. In pairs, divide the problem into smaller parts, identifying those parts that are similar. Write down your solution as an algorithm in structured English. Swap your algorithm with another pair. Test their algorithm by following their instructions to draw a regular polygon. Discuss any similarities and differences between your solutions.

Below, you will see the algorithm from the What you should already know section at the start of this chapter written using each of these three methods.

Structured English
1 Ask for the number of values
2 Loop that number of times
3 Enter a value in loop
4 Add the value to the Total in loop
5 Calculate and output average

Pseudocode
Total ← 0
PRINT "Enter the number of values to average"
INPUT Number
FOR Counter ← 1 TO Number
    PRINT "Enter value"
    INPUT Value
    Total ← Total + Value
NEXT Counter
Average ← Total / Number
PRINT "The average of ", Number, " values is ", Average
Flowchart
The flowchart contains the following boxes, linked in this order:
1 Start
2 Total = 0, Counter = 1
3 OUTPUT "Enter the number of values to average"
4 INPUT Number
5 OUTPUT "Enter value"
6 INPUT Value
7 Total = Total + Value, Counter = Counter + 1
8 Counter > Number? No: return to box 5. Yes: continue to box 9.
9 Average = Total/Number
10 OUTPUT "The average of ", Number, " values is ", Average
11 End
▲ Figure 9.3
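The flowchart in Figure 9.3 tests Counter > Number after the loop body, so the body always runs at least once. A rough Python sketch of that structure is given below. It assumes that Number is at least 1, and it takes the values from a list rather than from the keyboard so that it can be tested easily; the function name average_from_flowchart is hypothetical.

```python
def average_from_flowchart(number, values):
    """Follow the flowchart; 'values' stands in for the keyboard input."""
    total = 0
    counter = 1
    while True:
        value = values[counter - 1]   # INPUT Value
        total = total + value         # Total = Total + Value
        counter = counter + 1         # Counter = Counter + 1
        if counter > number:          # Counter > Number?  Yes: leave the loop
            break
    return total / number             # Average = Total/Number


print("The average of", 3, "values is", average_from_flowchart(3, [4, 8, 6]))
```

The pseudocode version uses a FOR loop instead, which would not run the body at all if Number were 0.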
9.2.2 Writing simple algorithms using pseudocode
Each line of pseudocode is usually a single step in an algorithm. The pseudocode used in this book follows the rules in the Cambridge International AS & A Level Computer Science Pseudocode Guide for Teachers and is set out using a fixed width font and indentation, where required, of four spaces, except for THEN, ELSE and CASE clauses that are only indented by two spaces. All identifier names used in pseudocode should be meaningful; for example, the name of a person could be stored in the variable identified by Name. They should also follow some basic rules: they should only contain the characters A–Z, a–z and 0–9, and should start with a letter. Pseudocode identifiers are usually considered to be case insensitive, unlike identifiers used in a programming language.
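The identifier rules above are simple enough to check automatically. The following Python sketch is one possible way to do so, using a regular expression; the names is_valid_identifier and same_identifier are hypothetical and are not part of any pseudocode guide.

```python
import re

# One letter, followed by any number of letters or digits.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_identifier(name):
    """Return True if the whole name follows the pseudocode identifier rules."""
    return IDENTIFIER_PATTERN.fullmatch(name) is not None


def same_identifier(first, second):
    """Pseudocode identifiers are case insensitive, so compare in lower case."""
    return first.lower() == second.lower()


print(is_valid_identifier("StudentName"))     # prints: True
print(is_valid_identifier("2ndName"))         # prints: False
print(same_identifier("Counter", "counter"))  # prints: True
```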
It is good practice to keep track of any identifiers used in an identifier table, such as Table 9.1.

Identifier name   Description
StudentName       Store a student name
Counter           Store a loop counter
StudentMark       Store a student mark
▲ Table 9.1
Pseudocode statements to use for writing algorithms.

To input a value:
INPUT StudentName

To output a message or a value or a combination:
OUTPUT "You have made an error"
OUTPUT StudentName
OUTPUT "Student name is ", StudentName

To assign a value to a variable (the value can be the result of a process or a calculation):
Counter ← 1
Counter ← Counter + 1
MyChar ← "A"
LetterValue ← ASC(MyChar)
StudentMark ← 40
Percentage ← (StudentMark / 80) * 100
OldString ← "Your mark is"
NewString ← OldString & " ninety-seven"

ACTIVITY 9B
Identify the values stored in the variables when the assignment statements in the example above have all been completed. The function ASC returns the ASCII value of a character.

Operators used in pseudocode assignment statements:
+   Addition
-   Subtraction
*   Multiplication
/   Division
&   String concatenation
←   Assignment

To perform a selection using IF statements for a single choice or a choice and an alternative, and CASE statements when there are multiple choices or multiple choices and an alternative:
IF – single choice
IF MyValue > YourValue
  THEN
    OUTPUT "I win"
ENDIF

IF – single choice with alternative
IF MyValue > YourValue
  THEN
    OUTPUT "I win"
  ELSE
    OUTPUT "You win"
ENDIF

CASE – multiple choices
CASE OF Direction
  "N": Y ← Y + 1
  "S": Y ← Y - 1
  "E": X ← X + 1
  "W": X ← X - 1
ENDCASE

CASE – multiple choices with alternative
CASE OF Direction
  "N": Y ← Y + 1
  "S": Y ← Y - 1
  "E": X ← X + 1
  "W": X ← X - 1
  OTHERWISE : OUTPUT "Error"
ENDCASE

Relational operators used in pseudocode selection statements:
=    Equal to
<>   Not equal to
>    Greater than
<    Less than
>=   Greater than or equal to
<=   Less than or equal to

Here are three programs to demonstrate the single choice IF statement, one in each of the three prescribed programming languages.

Python
# IF - single choice Python
myValue = int(input("Please enter my value "))
yourValue = int(input("Please enter your value "))
if myValue > yourValue:
    print ("I win")

The colon indicates the start of the THEN clause. All statements in the THEN clause are indented as shown.
VB
'IF - single choice VB
Module Module1
    Sub Main()
        Dim myValue, yourValue As Integer
        Console.Write("Please enter my value ")
        myValue = Integer.Parse(Console.ReadLine())
        Console.Write("Please enter your value ")
        yourValue = Integer.Parse(Console.ReadLine())
        If myValue > yourValue Then
            Console.WriteLine("I win")
            Console.ReadKey() 'wait for keypress
        End If
    End Sub
End Module

Note the use of THEN and END IF.

Java
//IF - single choice Java
import java.util.Scanner;
class IFProgram
{
    public static void main(String args[])
    {
        Scanner myObj = new Scanner(System.in);
        System.out.println("Please enter my value ");
        int myValue = myObj.nextInt();
        System.out.println("Please enter your value ");
        int yourValue = myObj.nextInt();
        if (myValue > yourValue)
        {
            System.out.println("I win");
        }
    }
}

{} are used to show the start and end of the THEN clause.
ACTIVITY 9C
1 In the programming language you have chosen to use, write a short program to input MyValue and YourValue and complete the single choice with an alternative IF statement shown on page 224. Note any differences in the command words you need to use and the construction of your programming statements compared with the pseudocode. 2 In the programming language you have chosen to use, write a short program to set X and Y to zero, input Direction and complete the multiple choice with an alternative CASE statement shown on page 224 and output X and Y. Note any differences in the command words you need to use and the construction of your programming statements compared to the pseudocode.
To perform iteration using FOR, REPEAT–UNTIL and WHILE loops:

Total ← 0
FOR Counter ← 1 TO 10
    OUTPUT "Enter a number "
    INPUT Number
    Total ← Total + Number
NEXT Counter
OUTPUT "The total is ", Total

FOR Counter ← 1 TO 10 STEP 2
    OUTPUT Counter
NEXT Counter

A FOR loop has a fixed number of repeats; the STEP increment is an optional expression that must be a whole number.

REPEAT
    OUTPUT "Please enter a positive number "
    INPUT Number
UNTIL Number > 0

Statements in a REPEAT loop are always executed at least once.

Number ← 0
WHILE Number >= 0 DO
    OUTPUT "Please enter a negative number "
    INPUT Number
ENDWHILE

Statements in a WHILE loop may sometimes not be executed.

Programming languages may not always use the same iteration constructs as pseudocode, so it is important to be able to write a program that performs the same task as a solution given in pseudocode.
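Python, for example, has no REPEAT–UNTIL statement, so a post-condition loop has to be built from a while loop. The sketch below shows one common way of doing this alongside the WHILE version. To keep it testable, the numbers come from a list rather than the keyboard, which is an assumption made for this example, as are the function names; each list is assumed to contain a number that ends the loop.

```python
def first_positive(inputs):
    """REPEAT ... UNTIL Number > 0: the body always runs at least once."""
    entries = iter(inputs)
    while True:
        number = next(entries)      # INPUT Number
        if number > 0:              # UNTIL Number > 0
            return number


def first_negative(inputs):
    """WHILE Number >= 0 DO ... ENDWHILE: the test comes before the body."""
    entries = iter(inputs)
    number = 0                      # Number <- 0
    while number >= 0:
        number = next(entries)      # INPUT Number
    return number


print(first_positive([-3, 0, 5, 9]))   # prints: 5
print(first_negative([4, 0, -2, 7]))   # prints: -2
```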
Here are three programs to demonstrate a simple FOR loop, one in each of the three prescribed programming languages. Note the construction of the FOR statement, as it is different from the pseudocode.

Python
# FOR - simple loop Python
for Counter in range (1,10,2):
    print(Counter)

The colon indicates the start of the FOR loop. All statements in the FOR loop are indented as shown.

VB
'FOR - simple loop VB
Module Module1
    Sub Main()
        Dim Counter As Integer
        For Counter = 1 To 10 Step 2
            Console.WriteLine(Counter)
        Next
        Console.ReadKey() 'wait for keypress
    End Sub
End Module

Note the use of STEP and NEXT.

Java
//FOR - simple loop Java
class FORProgram
{
    public static void main(String args[])
    {
        for (int Counter = 1; Counter <= 10; Counter = Counter + 2)
        {
            System.out.println(Counter);
        }
    }
}

ACTIVITY 9D
In the programming language you have chosen to use, write a short program to perform the same tasks as the other three loops shown in pseudocode. Note any differences in the command words you need to use, and the construction of your programming statements compared to the pseudocode.

Comparisons can be combined using logical operators such as AND, with brackets to make the order clear, for example:
(Number > 0) AND (Number < 50)
ACTIVITY 9E
In pseudocode, write statements to check that a number input is between 10 and 20 or over 100. Make use of brackets to ensure that the order of the comparisons is clear.

A simple algorithm can be clearly documented using these statements. A more realistic algorithm to find the average of a number of integers input would include checks that all the values input are whole numbers and that the number input to determine how many integers are input is also positive. This can be written in pseudocode by making use of the function INT(x) that returns the integer part of x:

Total ← 0
REPEAT
    PRINT "Enter the number of values to average"
    INPUT Number
UNTIL (Number > 0) AND (Number = INT(Number))
FOR Counter ← 1 TO Number
    REPEAT
        PRINT "Enter an integer value "
        INPUT Value
    UNTIL Value = INT(Value)
    Total ← Total + Value
NEXT Counter
Average ← Total / Number
PRINT "The average of ", Number, " values is ", Average

The identifier table for this algorithm is presented in Table 9.2.

Identifier name   Description
Total             Running total of integer values entered
Number            Number of integer values to enter
Value             Integer value input
Average           Average of all the integer values entered
▲ Table 9.2
Here are three programs to find the average of a number of integers input, one in each of the three prescribed programming languages. Note the construction of the loops, as they are different from the pseudocode. All the programming languages check for an integer value.
Python # Find the average of a number of integers input Python Total = 0 Number = int(input("Enter the number of values to average ")) while Number upperBound)
IF found
  THEN
    OUTPUT "Item found"
  ELSE
    OUTPUT "Item not found"
ENDIF

Identifier    Description
myList        Array to be searched
upperBound    Upper bound of the array
lowerBound    Lower bound of the array
index         Pointer to current array element
item          Item to be found
found         Flag to show when item has been found
▲ Table 19.1
This method works for a list in which the items can be stored in any order, but as the size of the list increases, the average time taken to retrieve an item increases correspondingly. The Cambridge International AS & A Level Computer Science syllabus requires you to be able to write code in one of the following programming languages: Python, VB and Java. It is very important to practise writing different routines in the programming language of your choice; the more routines you write, the easier it is to write programming code that works. Here is a simple linear search program written in Python, VB and Java using a FOR loop.

Python
#Python program for Linear Search
#create array to store all the numbers
myList = [4, 2, 8, 17, 9, 3, 7, 12, 34, 21]
#enter item to search for
item = int(input("Please enter item to be found "))
found = False
for index in range(len(myList)):
    if(myList[index] == item):
        found = True
if(found):
    print("Item found")
else:
    print("Item not found")
VB
'VB program for Linear Search
Module Module1
    Public Sub Main()
        Dim index As Integer
        Dim item As Integer
        Dim found As Boolean
        'Create array to store all the numbers
        Dim myList() As Integer = New Integer() {4, 2, 8, 17, 9, 3, 7, 12, 34, 21}
        'enter item to search for
        Console.Write("Please enter item to be found ")
        item = Integer.Parse(Console.ReadLine())
        For index = 0 To myList.Length - 1
            If (item = myList(index)) Then
                found = True
            End If
        Next
        If (found) Then
            Console.WriteLine("Item found")
        Else
            Console.WriteLine("Item not found")
        End If
        Console.ReadKey()
    End Sub
End Module
Java
//Java program Linear Search
import java.util.Scanner;
public class LinearSearch
{
    public static void main(String args[])
    {
        Scanner myObj = new Scanner(System.in);
        //Create array to store all the numbers
        int myList[] = new int[] {4, 2, 8, 17, 9, 3, 7, 12, 34, 21};
        int item, index;
        boolean found = false;
        // enter item to search for
        System.out.println("Please enter item to be found ");
        item = myObj.nextInt();
        // index must reach the last element, so the test is index < myList.length
        for (index = 0; index < myList.length; index++)
        {
            if (myList[index] == item)
            {
                found = true;
            }
        }
        if (found)
        {
            System.out.println("Item found");
        }
        else
        {
            System.out.println("Item not found");
        }
    }
}
ACTIVITY 19A Write the linear search in the programming language you are using, then change the code to use a similar type of loop that you used in the pseudocode at the beginning of Section 19.1.1, Linear search.
Binary search
A binary search is more efficient than a linear search, but it can only be used if the list is already sorted. The value of the middle item in the list is first tested to see if it matches the required item, and the half of the list that does not contain the required item is discarded. Then, the next item of the list to be tested is the middle item of the half of the list that was kept. This is repeated until the required item is found or there is nothing left to test.
For example, consider a list of the letters of the alphabet.
A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
To find the letter W using a linear search there would be 23 comparisons. Each letter in turn is compared with W.
Letter      A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Comparison  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23
▲ Figure 19.1 Linear search showing all the comparisons
To find the letter W using a binary search there could be just three comparisons. The letters M, T and W are compared with W in that order.
Letter      A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Comparison                                      1                    2        3
▲ Figure 19.2 Binary search showing all the comparisons
ACTIVITY 19B Check how many comparisons for each type of search it takes to find the letter D. Find any letters where the linear search would take fewer comparisons than the binary search.
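The comparison counts shown in Figures 19.1 and 19.2 can be checked with a short Python sketch. The function names linear_comparisons and binary_comparisons are illustrative and do not come from the text, and the linear search here is assumed to stop at the first match, as it does in Figure 19.1.

```python
import string

def linear_comparisons(items, target):
    """Count the items tested by a linear search that stops at the first match."""
    count = 0
    for value in items:
        count += 1
        if value == target:
            break
    return count

def binary_comparisons(items, target):
    """Count the items tested by a binary search on a sorted list."""
    lower, upper = 0, len(items) - 1
    count = 0
    while lower <= upper:
        middle = (lower + upper) // 2
        count += 1
        if items[middle] == target:
            break
        if target > items[middle]:
            lower = middle + 1      # discard the lower half
        else:
            upper = middle - 1      # discard the upper half
    return count

letters = list(string.ascii_uppercase)
print(linear_comparisons(letters, "W"))  # 23
print(binary_comparisons(letters, "W"))  # 3
```

Counting the items tested, rather than timing the search, keeps the result independent of the computer used.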
A binary search usually takes far fewer comparisons than a linear search to find an item in a list. For example, if a list had 1024 elements, the maximum number of items tested by a binary search would be 11, whereas a linear search could take up to 1024 comparisons.
Here is the pseudocode for the binary search algorithm to find if an item is in the populated 1D array myList. The identifier table is the same as the one used for the linear search.
DECLARE myList : ARRAY[0:9] OF INTEGER
DECLARE upperBound : INTEGER
DECLARE lowerBound : INTEGER
DECLARE index : INTEGER
DECLARE item : INTEGER
DECLARE found : BOOLEAN
upperBound ← 9
lowerBound ← 0
OUTPUT "Please enter item to be found"
INPUT item
found ← FALSE
REPEAT
    index ← INT ( (upperBound + lowerBound) / 2 )
    IF item = myList[index] THEN
        found ← TRUE
    ENDIF
    IF item > myList[index] THEN
        lowerBound ← index + 1
    ENDIF
    IF item < myList[index] THEN
        upperBound ← index - 1
    ENDIF
    // stop when the item is found or there are no items left to test
UNTIL (found = TRUE) OR (lowerBound > upperBound)
IF found THEN
    OUTPUT "Item found"
ELSE
    OUTPUT "Item not found"
ENDIF
Identifier    Description
myList        Array to be searched
upperBound    Upper bound of the array
lowerBound    Lower bound of the array
index         Pointer to current array element
item          Item to be found
found         Flag to show when item has been found
▲ Table 19.2
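The worst case for a binary search can be checked by simulation. The sketch below is illustrative only: the function name worst_case_comparisons is an assumption, and the list is taken to hold the values 0 to size - 1 so that each value equals its own index. For 1024 elements the largest number of items tested comes out as 11.

```python
def worst_case_comparisons(size):
    """Return the largest number of items a binary search tests in a sorted list of this size."""
    worst = 0
    for target in range(size):          # the list is simply 0, 1, 2, ... size - 1
        lower, upper = 0, size - 1
        tested = 0
        while lower <= upper:
            middle = (lower + upper) // 2
            tested += 1
            if middle == target:        # value at index middle is middle itself
                break
            if target > middle:
                lower = middle + 1
            else:
                upper = middle - 1
        worst = max(worst, tested)
    return worst

print(worst_case_comparisons(1024))  # 11
print(worst_case_comparisons(26))    # 5
```

Each test halves the part of the list still to be searched, so doubling the size of the list adds only one more test in the worst case.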
The code structure for a binary search is very similar to the linear search program shown for each of the programming languages. You will need to populate myList before searching for an item, and set the variables found, lowerBound and upperBound. You will need to use a conditional loop like those shown in the table below.

Python (uses a condition to repeat the loop at the start of the loop)
while (not found) and (lowerBound <= upperBound):
    :
    :

VB (uses a condition to stop the loop at the end of the loop)
Do
    :
    :
Loop Until (found) Or (lowerBound > upperBound)

Java (uses a condition to repeat the loop at the end of the loop)
do {
    :
    :
} while ((!found) && (lowerBound <= upperBound));

▲ Table 19.3
You will need to use If statements like those shown in the table below to test if the item is found, or to decide which part of myList to use next, and to update the upperBound or lowerBound accordingly.

Python (using integer division)
index = (upperBound + lowerBound) // 2
if(myList[index] == item):
    found = True
if item > myList[index]:
    lowerBound = index + 1
if item < myList[index]:
    upperBound = index - 1

VB (using integer division)
index = (upperBound + lowerBound) \ 2
If (item = myList(index)) Then
    found = True
End If
If item > myList(index) Then
    lowerBound = index + 1
End If
If item < myList(index) Then
    upperBound = index - 1
End If

Java (automatic integer division)
index = (upperBound + lowerBound) / 2;
if (myList[index] == item) {
    found = true;
}
if (item > myList[index]) {
    lowerBound = index + 1;
}
if (item < myList[index]) {
    upperBound = index - 1;
}

▲ Table 19.4
ACTIVITY 19C In your chosen programming language, write a short program to complete the binary search. Use this sample data: 16, 19, 21, 27, 36, 42, 55, 67, 76, 89 Search for the values 19 and 77 to test your program.
19.1.2 Understanding insertion and bubble sorting methods
Bubble sort
In Chapter 10, we looked at the bubble sort method of sorting a list. This is a method of sorting data in an array into alphabetical or numerical order by comparing adjacent items and swapping them if they are in the wrong order.
The bubble sort algorithm and identifier table to sort the populated 1D array myList from Chapter 10 is repeated here.
DECLARE myList : ARRAY[0:8] OF INTEGER
DECLARE upperBound : INTEGER
DECLARE lowerBound : INTEGER
DECLARE index : INTEGER
DECLARE swap : BOOLEAN
DECLARE temp : INTEGER
DECLARE top : INTEGER
upperBound ← 8
lowerBound ← 0
top ← upperBound
REPEAT
    swap ← FALSE
    FOR index ← lowerBound TO top - 1
        IF myList[index] > myList[index + 1] THEN
            temp ← myList[index]
            myList[index] ← myList[index + 1]
            myList[index + 1] ← temp
            swap ← TRUE
        ENDIF
    NEXT index
    top ← top - 1
UNTIL (NOT swap) OR (top = 0)
Identifier    Description
myList        Array to be sorted
upperBound    Upper bound of the array
lowerBound    Lower bound of the array
index         Pointer to current array element
swap          Flag to show when swaps have been made
top           Index of last element to compare
temp          Temporary storage location during swap
▲ Table 19.5
Here is a simple bubble sort program written in Python, VB and Java, using a pre-condition loop and a FOR loop in Python and post-condition loops and FOR loops in VB and Java.
Python
#Python program for Bubble Sort
myList = [70,46,43,27,57,41,45,21,14]
top = len(myList)
swap = True
#pre-condition loop, repeats only while swaps are still being made
while (swap) and (top > 0):
    swap = False
    for index in range(top - 1):
        if myList[index] > myList[index + 1]:
            temp = myList[index]
            myList[index] = myList[index + 1]
            myList[index + 1] = temp
            swap = True
    top = top - 1
#output the sorted array
print(myList)
VB
'VB program for bubble sort
Module Module1
    Sub Main()
        Dim myList() As Integer = New Integer() {70, 46, 43, 27, 57, 41, 45, 21, 14}
        Dim index, top, temp As Integer
        Dim swap As Boolean
        top = myList.Length - 1
        'post-condition loop
        Do
            swap = False
            For index = 0 To top - 1 Step 1
                If myList(index) > myList(index + 1) Then
                    temp = myList(index)
                    myList(index) = myList(index + 1)
                    myList(index + 1) = temp
                    swap = True
                End If
            Next
            top = top - 1
        Loop Until (Not swap) Or (top = 0)
        'output the sorted array
        For index = 0 To myList.Length - 1
            Console.Write(myList(index) & " ")
        Next
        Console.ReadKey() 'wait for keypress
    End Sub
End Module
Java
// Java program for Bubble Sort
class BubbleSort
{
    public static void main(String args[])
    {
        int myList[] = {70, 46, 43, 27, 57, 41, 45, 21, 14};
        int index, top, temp;
        boolean swap;
        top = myList.length;
        // post-condition loop, repeats only while swaps are still being made
        do {
            swap = false;
            for (index = 0; index < top - 1; index++)
            {
                if (myList[index] > myList[index + 1])
                {
                    temp = myList[index];
                    myList[index] = myList[index + 1];
                    myList[index + 1] = temp;
                    swap = true;
                }
            }
            top = top - 1;
        }
        while ((swap) && (top > 0));
        // output the sorted array
        for (index = 0; index < myList.length; index++)
            System.out.print(myList[index] + " ");
        System.out.println();
    }
}
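The effect of the swap flag can be seen by counting the passes made over the list. The following Python sketch is illustrative only: the function name bubble_sort_passes and the test lists are assumptions, not part of the programs above.

```python
def bubble_sort_passes(values):
    """Sort a copy of values with a bubble sort and return (sorted list, number of passes)."""
    data = list(values)
    top = len(data)
    passes = 0
    swap = True
    while swap and top > 1:
        swap = False
        passes += 1
        for index in range(top - 1):
            if data[index] > data[index + 1]:
                data[index], data[index + 1] = data[index + 1], data[index]
                swap = True
        top -= 1
    return data, passes

print(bubble_sort_passes([1, 2, 3, 4, 5]))   # ([1, 2, 3, 4, 5], 1)
print(bubble_sort_passes([1, 2, 4, 3, 5]))   # ([1, 2, 3, 4, 5], 2)
print(bubble_sort_passes([5, 4, 3, 2, 1]))   # ([1, 2, 3, 4, 5], 4)
```

A list that is already sorted needs a single pass because no swap is made, while a list in reverse order needs the maximum number of passes.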
Insertion sort
The bubble sort works well for short lists and partially sorted lists. An insertion sort will also work well for these types of list. An insertion sort sorts data in a list into alphabetical or numerical order by placing each item in turn in the correct position in a sorted list. An insertion sort works well for incremental sorting, where elements are added to a list one at a time over an extended period while keeping the list sorted.
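Incremental sorting can be illustrated with Python's standard bisect module. This is a swapped-in technique rather than the insertion sort algorithm of this section: bisect.insort finds the position with a binary search and then inserts the value. The list name readings and the values used are hypothetical.

```python
import bisect

readings = []                       # hypothetical list that must stay sorted
for value in [27, 19, 36, 42, 16]:  # values arriving one at a time
    bisect.insort(readings, value)  # insert value at its sorted position
    print(readings)
```

After every insertion the list is still in order, so it can be searched with a binary search at any time.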
Here is the pseudocode and the identifier table for the insertion sort algorithm sorting the populated 1D array myList.
DECLARE myList : ARRAY[0:8] OF INTEGER
DECLARE upperBound : INTEGER
DECLARE lowerBound : INTEGER
DECLARE index : INTEGER
DECLARE key : INTEGER
DECLARE place : INTEGER
DECLARE temp : INTEGER
upperBound ← 8
lowerBound ← 0
FOR index ← lowerBound + 1 TO upperBound
    key ← myList[index]
    place ← index - 1
    IF myList[place] > key THEN
        WHILE place >= lowerBound AND myList[place] > key
            temp ← myList[place + 1]
            myList[place + 1] ← myList[place]
            myList[place] ← temp
            place ← place - 1
        ENDWHILE
        myList[place + 1] ← key
    ENDIF
NEXT index

Identifier    Description
myList        Array to be sorted
upperBound    Upper bound of the array
lowerBound    Lower bound of the array
index         Pointer to current array element
key           Element being placed
place         Position in array of element being moved
temp          Temporary storage location during swap
▲ Table 19.6
Figure 19.3 shows the changes to the 1D array myList as the insertion sort is completed. Each row shows myList after the element at that index has been checked and placed.
Index of element    myList
being checked       [0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]
(start)             27   19   36   42   16   89   21   16   55
1                   19   27   36   42   16   89   21   16   55
2                   19   27   36   42   16   89   21   16   55
3                   19   27   36   42   16   89   21   16   55
4                   16   19   27   36   42   89   21   16   55
5                   16   19   27   36   42   89   21   16   55
6                   16   19   21   27   36   42   89   16   55
7                   16   16   19   21   27   36   42   89   55
8                   16   16   19   21   27   36   42   55   89
▲ Figure 19.3
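The states of myList in Figure 19.3 can be reproduced with a short Python sketch. It is a sketch under stated assumptions: the function name insertion_sort_trace is illustrative, and larger elements are shifted one place to the right instead of being swapped as in the pseudocode, which gives the same list after each element is placed.

```python
def insertion_sort_trace(values):
    """Return the state of the list after each element has been placed."""
    data = list(values)
    states = []
    for index in range(1, len(data)):
        key = data[index]
        place = index - 1
        # shift larger elements one position to the right
        while place >= 0 and data[place] > key:
            data[place + 1] = data[place]
            place -= 1
        data[place + 1] = key
        states.append(list(data))
    return states

myList = [27, 19, 36, 42, 16, 89, 21, 16, 55]
for index, state in enumerate(insertion_sort_trace(myList), start=1):
    print(index, state)
```

The last line printed is the fully sorted list, 16 16 19 21 27 36 42 55 89.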
The element shaded blue is being checked and placed in the correct position. The elements shaded yellow are the other elements that also need to be moved if the element being checked is out of position. When sorting the same array, myList, the insertion sort made 21 swaps and the bubble sort shown in Chapter 10 made 38 swaps. The insertion sort performs better on partially sorted lists because, when each element is found to be in the wrong order in the list, it is moved to approximately the right place in the list. The bubble sort will only swap the element in the wrong order with its neighbour.
As the number of elements in a list increases, the time taken to sort the list increases. It has been shown that, as the number of elements increases, the performance of the bubble sort deteriorates faster than the insertion sort.
[Graph: time in seconds (0 to 600) plotted against the number of elements in the list (1000 to 300000) for bubble sort, insertion sort and quick sort]
▲ Figure 19.4 Time performance of sorting algorithms
The code structure for an insertion sort in each of the programming languages is very similar to the bubble sort program. You will need to assign values to lowerBound and upperBound and use a nested loop like those shown in the table below.

Python
for index in range(lowerBound + 1, upperBound + 1):
    key = myList[index]
    place = index - 1
    if myList[place] > key:
        while place >= lowerBound and myList[place] > key:
            temp = myList[place + 1]
            myList[place + 1] = myList[place]
            myList[place] = temp
            place = place - 1
        myList[place + 1] = key

VB (cannot use key as a variable)
For index = lowerBound + 1 To upperBound
    myKey = myList(index)
    place = index - 1
    If myList(place) > myKey Then
        'AndAlso stops myList(place) being tested when place is below lowerBound
        While (place >= lowerBound) AndAlso (myList(place) > myKey)
            temp = myList(place + 1)
            myList(place + 1) = myList(place)
            myList(place) = temp
            place = place - 1
        End While
        myList(place + 1) = myKey
    End If
Next

Java
for (index = lowerBound + 1; index <= upperBound; index++) {
    key = myList[index];
    place = index - 1;
    if (myList[place] > key) {
        do {
            temp = myList[place + 1];
            myList[place + 1] = myList[place];
            myList[place] = temp;
            place = place - 1;
        } while ((place >= lowerBound) && (myList[place] > key));
        myList[place + 1] = key;
    }
}

▲ Table 19.7
EXTENSION ACTIVITY 19A There are many other more efficient sorting algorithms. In small groups, investigate different sorting algorithms, finding out how the method works and the efficiency of that method. Share your results.
ACTIVITY 19D In your chosen programming language write a short program to complete the insertion sort.
19.1.3 Understanding and using abstract data types (ADTs)
Abstract data types (ADTs) were introduced in Chapter 10. Remember that an ADT is a collection of data and a set of operations on that data. There are several operations that are essential when using an ADT:
» finding an item already stored
» adding a new item
» deleting an item.
We started considering the ADTs stacks, queues and linked lists in Chapter 10. If you have not already done so, read Section 10.4 to ensure that you are ready to work with these data structures. Ensure that you can write algorithms to set up then add and remove items from stacks and queues.
Stacks
In Chapter 10, we looked at the data and the operations for a stack using pseudocode. You will need to be able to write a program to implement a stack. The data structures and operations required to implement a similar stack using a fixed length integer array and separate subroutines for the push and pop operations are set out below in each of the three prescribed programming languages. If you are unsure how the operations work, look back at Chapter 10.

Stack data structure

Python (empty stack with no elements)
stack = [None for index in range(0,10)]
basePointer = 0
topPointer = -1
stackFull = 10
item = None

VB (empty stack with no elements and variables set to public for subroutine access)
Public Dim stack() As Integer = {Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}
Public Dim basePointer As Integer = 0
Public Dim topPointer As Integer = -1
Public Const stackFull As Integer = 10
Public Dim item As Integer

Java (empty stack with no elements and variables set to public for subroutine access)
public static int stack[] = new int[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
public static int basePointer = 0;
public static int topPointer = -1;
public static final int stackFull = 10;
public static int item;

▲ Table 19.8
Stack pop operation

Python (global used within subroutine to access variables; topPointer points to the top of the stack)
def pop():
    global topPointer, basePointer, item
    if topPointer == basePointer - 1:
        print("Stack is empty, cannot pop")
    else:
        item = stack[topPointer]
        topPointer = topPointer - 1

VB (topPointer points to the top of the stack)
Sub pop()
    If topPointer = basePointer - 1 Then
        Console.WriteLine("Stack is empty, cannot pop")
    Else
        item = stack(topPointer)
        topPointer = topPointer - 1
    End If
End Sub

Java (topPointer points to the top of the stack)
static void pop()
{
    if (topPointer == basePointer - 1)
        System.out.println("Stack is empty, cannot pop");
    else
    {
        item = stack[topPointer];
        topPointer = topPointer - 1;
    }
}

▲ Table 19.9
Stack push operation

Python
def push(item):
    global topPointer
    if topPointer < stackFull - 1:
        topPointer = topPointer + 1
        stack[topPointer] = item
    else:
        print("Stack is full, cannot push")

VB
Sub push(ByVal item)
    If topPointer < stackFull - 1 Then
        topPointer = topPointer + 1
        stack(topPointer) = item
    Else
        Console.WriteLine("Stack is full, cannot push")
    End If
End Sub

Java
static void push(int item)
{
    if (topPointer < stackFull - 1)
    {
        topPointer = topPointer + 1;
        stack[topPointer] = item;
    }
    else
        System.out.println("Stack is full, cannot push");
}

▲ Table 19.10
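The stack operations can also be grouped together so that the data and the operations on it are kept in one place, which is the idea behind an ADT. The Python sketch below is one possible arrangement, not the approach used in the tables above: the class name Stack, the capacity parameter and the use of return values instead of printed messages are all assumptions made for illustration.

```python
class Stack:
    """Fixed-size stack of integers held in a list (an illustrative sketch)."""

    def __init__(self, capacity=10):
        self.items = [None] * capacity
        self.basePointer = 0
        self.topPointer = -1

    def push(self, item):
        """Add item to the top of the stack; return False if the stack is full."""
        if self.topPointer < len(self.items) - 1:
            self.topPointer += 1
            self.items[self.topPointer] = item
            return True
        return False

    def pop(self):
        """Remove and return the top item, or None if the stack is empty."""
        if self.topPointer == self.basePointer - 1:
            return None
        item = self.items[self.topPointer]
        self.topPointer -= 1
        return item

stack = Stack(3)
stack.push(5)
stack.push(9)
print(stack.pop())  # 9
print(stack.pop())  # 5
print(stack.pop())  # None
```

The last item pushed is the first item popped, and no global statements are needed because the pointers belong to the stack object.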
ACTIVITY 19E In your chosen programming language, write a program using subroutines to implement a stack with 10 elements. Test your program by pushing two integers 7 and 32 onto the stack, popping these integers off the stack, then trying to remove a third integer, and by pushing the integers 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 onto the stack, then trying to push 11 on to the stack.
Queues
In Chapter 10, we looked at the data and the operations for a circular queue using pseudocode. You will need to be able to write a program to implement a queue. The data structures and operations required to implement a similar queue using a fixed length integer array and separate subroutines for the enqueue and dequeue operations are set out below in each of the three programming languages. If you are unsure how the operations work, look back at Chapter 10.

Queue data structure

Python (empty queue with no items)
queue = [None for index in range(0,10)]
frontPointer = 0
rearPointer = -1
queueFull = 10
queueLength = 0

VB (empty queue with no items and variables set to public for subroutine access)
Public Dim queue() As Integer = {Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}
Public Dim frontPointer As Integer = 0
Public Dim rearPointer As Integer = -1
Public Const queueFull As Integer = 10
Public Dim queueLength As Integer = 0
Public Dim item As Integer

Java (empty queue with no elements and variables set to public for subroutine access)
public static int queue[] = new int[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
public static int frontPointer = 0;
public static int rearPointer = -1;
public static final int queueFull = 10;
public static int queueLength = 0;
public static int item;

▲ Table 19.11
Queue enqueue (add item to queue) operation
In each language, if the rearPointer is pointing to the last element of the array and the queue is not full, the item is stored in the first element of the array.

Python (global used within subroutine to access variables)
def enQueue(item):
    global queueLength, rearPointer
    if queueLength < queueFull:
        if rearPointer < len(queue) - 1:
            rearPointer = rearPointer + 1
        else:
            rearPointer = 0
        queueLength = queueLength + 1
        queue[rearPointer] = item
    else:
        print("Queue is full, cannot enqueue")

VB
Sub enQueue(ByVal item)
    If queueLength < queueFull Then
        If rearPointer < queue.Length - 1 Then
            rearPointer = rearPointer + 1
        Else
            rearPointer = 0
        End If
        queueLength = queueLength + 1
        queue(rearPointer) = item
    Else
        Console.WriteLine("Queue is full, cannot enqueue")
    End If
End Sub

Java
static void enQueue(int item)
{
    if (queueLength < queueFull)
    {
        if (rearPointer < queue.length - 1)
            rearPointer = rearPointer + 1;
        else
            rearPointer = 0;
        queueLength = queueLength + 1;
        queue[rearPointer] = item;
    }
    else
        System.out.println("Queue is full, cannot enqueue");
}

▲ Table 19.12
Queue dequeue (remove item from queue) operation
In each language, if the frontPointer points to the last element in the array and the queue is not empty, the pointer is updated to point at the first item in the array rather than the next item in the array.

Python
def deQueue():
    global queueLength, frontPointer, item
    if queueLength == 0:
        print("Queue is empty, cannot dequeue")
    else:
        item = queue[frontPointer]
        if frontPointer == len(queue) - 1:
            frontPointer = 0
        else:
            frontPointer = frontPointer + 1
        queueLength = queueLength - 1

VB
Sub deQueue()
    If queueLength = 0 Then
        Console.WriteLine("Queue is empty, cannot dequeue")
    Else
        item = queue(frontPointer)
        If frontPointer = queue.Length - 1 Then
            frontPointer = 0
        Else
            frontPointer = frontPointer + 1
        End If
        queueLength = queueLength - 1
    End If
End Sub

Java
static void deQueue()
{
    if (queueLength == 0)
        System.out.println("Queue is empty, cannot dequeue");
    else
    {
        item = queue[frontPointer];
        if (frontPointer == queue.length - 1)
            frontPointer = 0;
        else
            frontPointer = frontPointer + 1;
        queueLength = queueLength - 1;
    }
}

▲ Table 19.13
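The wrap-around of the pointers in a circular queue can also be written with the remainder operator. The Python sketch below is an alternative to the If statements in the tables above, not a replacement for them. It assumes a queue of only four elements so that the wrap-around happens quickly, and it returns values instead of printing messages.

```python
queue = [None] * 4      # small queue so wrap-around happens quickly
frontPointer = 0
rearPointer = -1
queueLength = 0

def enQueue(item):
    """Add item at the rear; return False if the queue is full."""
    global rearPointer, queueLength
    if queueLength == len(queue):
        return False
    rearPointer = (rearPointer + 1) % len(queue)    # wraps from the last index to 0
    queue[rearPointer] = item
    queueLength += 1
    return True

def deQueue():
    """Remove and return the front item, or None if the queue is empty."""
    global frontPointer, queueLength
    if queueLength == 0:
        return None
    item = queue[frontPointer]
    frontPointer = (frontPointer + 1) % len(queue)  # wraps from the last index to 0
    queueLength -= 1
    return item

for value in [10, 20, 30, 40]:
    enQueue(value)
deQueue()            # removes 10, freeing element 0
enQueue(50)          # rearPointer wraps round to element 0
print(queue, frontPointer, rearPointer)  # [50, 20, 30, 40] 1 0
```

The item 50 is stored in element 0 of the array even though it is at the rear of the queue, because rearPointer has wrapped round.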
ACTIVITY 19F In your chosen programming language, write a program using subroutines to implement a queue with 10 elements. Test your program by adding two integers 7 and 32 to the queue, removing these integers from the queue, then trying to remove a third integer, and by adding the integers 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 to the queue then trying to add 11 to the queue.
Linked lists
Finding an item in a linked list
In Chapter 10, we looked at defining a linked list as an ADT; now we need to consider writing algorithms using a linked list. Here is the declaration algorithm and the identifier table from Chapter 10.
DECLARE myLinkedList : ARRAY[0:11] OF INTEGER
DECLARE myLinkedListPointers : ARRAY[0:11] OF INTEGER
DECLARE startPointer : INTEGER
DECLARE heapStartPointer : INTEGER
DECLARE index : INTEGER
heapStartPointer ← 0
startPointer ← -1 // list empty
FOR index ← 0 TO 11
    myLinkedListPointers[index] ← index + 1
NEXT index
// the linked list heap is a linked list of all the spaces in the linked list, this is set up when the linked list is initialised
myLinkedListPointers[11] ← -1
// the final heap pointer is set to -1 to show no further links
The above code sets up a linked list ready for use. The identifier table is below.
Identifier              Description
myLinkedList            Linked list to be searched
myLinkedListPointers    Pointers for linked list
startPointer            Start of the linked list
heapStartPointer        Start of the heap
index                   Pointer to current element in the linked list
▲ Table 19.14
Figure 19.5 below shows an empty linked list and its corresponding pointers.
startPointer = -1
                            myLinkedList    myLinkedListPointers
heapStartPointer →   [0]                     1
                     [1]                     2
                     [2]                     3
                     [3]                     4
                     [4]                     5
                     [5]                     6
                     [6]                     7
                     [7]                     8
                     [8]                     9
                     [9]                     10
                     [10]                    11
                     [11]                    −1
▲ Figure 19.5
Figure 19.6 below shows a populated linked list and its corresponding pointers.
                            myLinkedList    myLinkedListPointers
                     [0]    27               −1
                     [1]    19               0
                     [2]    36               1
                     [3]    42               2
startPointer →       [4]    16               3
heapStartPointer →   [5]                     6
                     [6]                     7
                     [7]                     8
                     [8]                     9
                     [9]                     10
                     [10]                    11
                     [11]                    −1
▲ Figure 19.6
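The order of the items in the populated linked list of Figure 19.6 can be seen by following the pointers from startPointer until the null pointer is reached. In the Python sketch below the arrays and pointer values are taken from the figure; the function name follow is an assumption made for illustration.

```python
myLinkedList = [27, 19, 36, 42, 16, None, None, None, None, None, None, None]
myLinkedListPointers = [-1, 0, 1, 2, 3, 6, 7, 8, 9, 10, 11, -1]
startPointer = 4
heapStartPointer = 5
nullPointer = -1

def follow(pointer):
    """Return the array indexes visited by following the pointers from pointer."""
    visited = []
    while pointer != nullPointer:
        visited.append(pointer)
        pointer = myLinkedListPointers[pointer]
    return visited

print([myLinkedList[i] for i in follow(startPointer)])  # [16, 42, 36, 19, 27]
print(follow(heapStartPointer))                         # [5, 6, 7, 8, 9, 10, 11]
```

The same array of pointers holds two chains: the list itself, starting at element 4, and the heap of free elements, starting at element 5.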
The algorithm to find if an item is in the linked list myLinkedList and return the pointer to the item if found or a null pointer if not found, could be written as a function in pseudocode as shown below.
DECLARE itemSearch : INTEGER
DECLARE itemPointer : INTEGER
CONSTANT nullPointer = -1
FUNCTION find(itemSearch) RETURNS INTEGER
    DECLARE found : BOOLEAN
    itemPointer ← startPointer
    found ← FALSE
    WHILE (itemPointer <> nullPointer) AND NOT found DO
        IF myLinkedList[itemPointer] = itemSearch THEN
            found ← TRUE
        ELSE
            itemPointer ← myLinkedListPointers[itemPointer]
        ENDIF
    ENDWHILE
    RETURN itemPointer
    // this function returns the item pointer of the value found or -1 if the item is not found
ENDFUNCTION
The following programs use a function to search for an item in a populated linked list.
Python
#Python program for finding an item in a linked list
#populating the linked list
myLinkedList = [27, 19, 36, 42, 16, None, None, None, None, None, None, None]
myLinkedListPointers = [-1, 0, 1, 2, 3, 6, 7, 8, 9, 10, 11, -1]
startPointer = 4
nullPointer = -1
#defining the find function
def find(itemSearch):
    found = False
    itemPointer = startPointer
    while itemPointer != nullPointer and not found:
        if myLinkedList[itemPointer] == itemSearch:
            found = True
        else:
            itemPointer = myLinkedListPointers[itemPointer]
    return itemPointer
#enter item to search for
item = int(input("Please enter item to be found "))
#calling the find function
result = find(item)
if result != -1:
    print("Item found")
else:
    print("Item not found")
VB
'VB program for finding an item in a linked list
Module Module1
    Public Dim startPointer As Integer = 4
    Public Const nullPointer As Integer = -1
    Public Dim item As Integer
    Public Dim itemPointer As Integer
    Public Dim result As Integer
    'populating the linked list
    Public Dim myLinkedList() As Integer = {27, 19, 36, 42, 16, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}
    Public Dim myLinkedListPointers() As Integer = {-1, 0, 1, 2, 3, 6, 7, 8, 9, 10, 11, -1}
    Public Sub Main()
        'enter item to search for
        Console.Write("Please enter item to be found ")
        item = Integer.Parse(Console.ReadLine())
        'calling the find function
        result = find(item)
        If result <> -1 Then
            Console.WriteLine("Item found")
        Else
            Console.WriteLine("Item not found")
        End If
        Console.ReadKey()
    End Sub
    'defining the find function
    Function find(ByVal itemSearch As Integer) As Integer
        Dim found As Boolean = False
        itemPointer = startPointer
        While (itemPointer <> nullPointer) And Not found
            If itemSearch = myLinkedList(itemPointer) Then
                found = True
            Else
                itemPointer = myLinkedListPointers(itemPointer)
            End If
        End While
        Return itemPointer
    End Function
End Module
Java
//Java program for finding an item in a linked list
import java.util.Scanner;
class LinkedListAll
{
    // populating the linked list
    public static int myLinkedList[] = new int[] {27, 19, 36, 42, 16, 0, 0, 0, 0, 0, 0, 0};
    public static int myLinkedListPointers[] = new int[] {-1, 0, 1, 2, 3, 6, 7, 8, 9, 10, 11, -1};
    public static int startPointer = 4;
    public static final int nullPointer = -1;
    // defining the find function
    static int find(int itemSearch)
    {
        boolean found = false;
        int itemPointer = startPointer;
        do
        {
            if (itemSearch == myLinkedList[itemPointer])
            {
                found = true;
            }
            else
            {
                itemPointer = myLinkedListPointers[itemPointer];
            }
        }
        while ((itemPointer != nullPointer) && !found);
        return itemPointer;
    }
    public static void main(String args[])
    {
        Scanner input = new Scanner(System.in);
        System.out.println("Please enter item to be found ");
        int item = input.nextInt();
        // calling the find function
        int result = find(item);
        if (result != -1)
        {
            System.out.println("Item found");
        }
        else
        {
            System.out.println("Item not found");
        }
    }
}
ACTIVITY 19G In the programming language of your choice, use the code given to write a program to set up the populated linked list and find an item stored in it.
The trace table below shows the algorithm being used to search for 42 in myLinkedList.
startPointer        itemPointer    itemSearch
Already set to 4    4              42
                    3
▲ Table 19.15 Trace table
Inserting items into a linked list
The algorithm to insert an item in the linked list myLinkedList could be written as a procedure in pseudocode as shown below.
DECLARE itemAdd : INTEGER
DECLARE startPointer : INTEGER
DECLARE heapStartPointer : INTEGER
DECLARE tempPointer : INTEGER
CONSTANT nullPointer = -1
PROCEDURE linkedListAdd(itemAdd)
    // check for list full
    IF heapStartPointer = nullPointer THEN
        OUTPUT "Linked list full"
    ELSE
        // get next place in list from the heap
        tempPointer ← startPointer // keep old start pointer
        startPointer ← heapStartPointer // set start pointer to next position in heap
        heapStartPointer ← myLinkedListPointers[heapStartPointer] // reset heap start pointer
        myLinkedList[startPointer] ← itemAdd // put item in list
        myLinkedListPointers[startPointer] ← tempPointer // update linked list pointer
    ENDIF
ENDPROCEDURE
Here is the identifier table.
Identifier          Description
startPointer        Start of the linked list
heapStartPointer    Start of the heap
nullPointer         Null pointer set to -1
itemAdd             Item to add to the list
tempPointer         Temporary pointer
▲ Table 19.16
Figure 19.7 below shows the populated linked list and its corresponding pointers again.
                            myLinkedList    myLinkedListPointers
                     [0]    27               −1
                     [1]    19               0
                     [2]    36               1
                     [3]    42               2
startPointer →       [4]    16               3
heapStartPointer →   [5]                     6
                     [6]                     7
                     [7]                     8
                     [8]                     9
                     [9]                     10
                     [10]                    11
                     [11]                    −1
▲ Figure 19.7
The trace table below shows the algorithm being used to add 18 to myLinkedList.
startPointer        heapStartPointer    itemAdd    tempPointer
Already set to 4    Already set to 5    18
5                   6                              4
▲ Table 19.17 Trace table
The linked list, myLinkedList, will now be as shown below.
                            myLinkedList    myLinkedListPointers
                     [0]    27               −1
                     [1]    19               0
                     [2]    36               1
                     [3]    42               2
                     [4]    16               3
startPointer →       [5]    18               4
heapStartPointer →   [6]                     7
                     [7]                     8
                     [8]                     9
                     [9]                     10
                     [10]                    11
                     [11]                    −1
▲ Figure 19.8
The following procedure adds an item to a linked list.
Python
def insert(itemAdd):
    global startPointer, heapStartPointer
    if heapStartPointer == nullPointer:
        print("Linked List full")
    else:
        #adjusting the pointers and adding the item
        tempPointer = startPointer
        startPointer = heapStartPointer
        heapStartPointer = myLinkedListPointers[heapStartPointer]
        myLinkedList[startPointer] = itemAdd
        myLinkedListPointers[startPointer] = tempPointer
VB
Sub insert (ByVal itemAdd)
    Dim tempPointer As Integer
    If heapStartPointer = nullPointer Then
        Console.WriteLine("Linked List full")
    Else
        'adjusting the pointers and adding the item
        tempPointer = startPointer
        startPointer = heapStartPointer
        heapStartPointer = myLinkedListPointers(heapStartPointer)
        myLinkedList(startPointer) = itemAdd
        myLinkedListPointers(startPointer) = tempPointer
    End If
End Sub
Java
static void insert(int itemAdd)
{
    if (heapStartPointer == nullPointer)
        System.out.println("Linked List is full");
    else
    {
        // adjusting the pointers and adding the item
        int tempPointer = startPointer;
        startPointer = heapStartPointer;
        heapStartPointer = myLinkedListPointers[heapStartPointer];
        myLinkedList[startPointer] = itemAdd;
        myLinkedListPointers[startPointer] = tempPointer;
    }
}
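The insert procedure can be checked against the trace table and Figure 19.8 by running it on the populated linked list from Figure 19.7. The Python sketch below sets up the arrays from that figure and adds 18; the print statements at the end are an addition for checking the result.

```python
myLinkedList = [27, 19, 36, 42, 16, None, None, None, None, None, None, None]
myLinkedListPointers = [-1, 0, 1, 2, 3, 6, 7, 8, 9, 10, 11, -1]
startPointer = 4
heapStartPointer = 5
nullPointer = -1

def insert(itemAdd):
    """Add itemAdd at the start of the list, using the first free element of the heap."""
    global startPointer, heapStartPointer
    if heapStartPointer == nullPointer:
        print("Linked list full")
    else:
        tempPointer = startPointer                                 # keep old start pointer
        startPointer = heapStartPointer                            # new item uses first free element
        heapStartPointer = myLinkedListPointers[heapStartPointer]  # heap now starts at next free element
        myLinkedList[startPointer] = itemAdd
        myLinkedListPointers[startPointer] = tempPointer           # new item points to old start

insert(18)
print(startPointer, heapStartPointer)  # 5 6
print(myLinkedListPointers)            # [-1, 0, 1, 2, 3, 4, 7, 8, 9, 10, 11, -1]
```

The pointer values printed match those shown in Figure 19.8, with the new item 18 stored in element 5 and pointing to element 4.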
ACTIVITY 19H Use the algorithm to add 25 to myLinkedList. Show this in a trace table and show myLinkedList once 25 has been added. Add the insert procedure to your program, add code to input an item, add this item to the linked list then print out the list and the pointers before and after the item was added.
Deleting items from a linked list
The algorithm to delete an item from the linked list myLinkedList could be written as a procedure in pseudocode as shown below.
DECLARE itemDelete : INTEGER
DECLARE oldIndex : INTEGER
DECLARE index : INTEGER
DECLARE startPointer : INTEGER
DECLARE heapStartPointer : INTEGER
DECLARE tempPointer : INTEGER
CONSTANT nullPointer = -1
PROCEDURE linkedListDelete(itemDelete)
    // check for list empty
    IF startPointer = nullPointer THEN
        OUTPUT "Linked list empty"
    ELSE
        // find item to delete in linked list
        index ← startPointer
        WHILE myLinkedList[index] <> itemDelete AND (index <> nullPointer) DO
            oldIndex ← index
            index ← myLinkedListPointers[index]
        ENDWHILE
        IF index = nullPointer THEN
            OUTPUT "Item ", itemDelete, " not found"
        ELSE
            // delete the pointer and the item
            tempPointer ← myLinkedListPointers[index]
            myLinkedListPointers[index] ← heapStartPointer
            heapStartPointer ← index
            myLinkedListPointers[oldIndex] ← tempPointer
        ENDIF
    ENDIF
ENDPROCEDURE
477
457591_19_CI_AS & A_Level_CS_450-497.indd 477
26/04/19 9:14 AM
Here is the identifier table.
| Identifier | Description |
|---|---|
| startPointer | Start of the linked list |
| heapStartPointer | Start of the heap |
| nullPointer | Null pointer set to −1 |
| index | Pointer to current list element |
| oldIndex | Pointer to previous list element |
| itemDelete | Item to delete from the list |
| tempPointer | Temporary pointer |
▲ Figure 19.18
The trace table below shows the algorithm being used to delete 36 from myLinkedList.
| startPointer | heapStartPointer | itemDelete | index | oldIndex | tempPointer |
|---|---|---|---|---|---|
| Already set to 4 | Already set to 5 | 36 | 4 | | |
| | | | 3 | 4 | |
| | | | 2 | 3 | |
| | 2 | | | | 1 |
▲ Table 19.19 Trace table
The linked list, myLinkedList, will now be as follows.
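| Index | myLinkedList | myLinkedListPointers |
|---|---|---|
| [0] | 27 | −1 |
| [1] | 19 | 0 |
| [2] | 36 | 6 |
| [3] | 42 | 1 |
| [4] | 16 | 3 |
| [5] | 18 | 4 |
| [6] | | 7 |
| [7] | | 8 |
| [8] | | 9 |
| [9] | | 10 |
| [10] | | 11 |
| [11] | | −1 |
(The figure marks heapStartPointer, startPointer and the updated pointers.)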
▲ Figure 19.9
The following procedure deletes an item from a linked list.
Python
def delete(itemDelete):
    global startPointer, heapStartPointer
    if startPointer == nullPointer:
        print("Linked List empty")
    else:
        index = startPointer
        oldindex = nullPointer
        # test the pointer first so that a null pointer is never used as an index
        while index != nullPointer and myLinkedList[index] != itemDelete:
            oldindex = index
            index = myLinkedListPointers[index]
        if index == nullPointer:
            print("Item ", itemDelete, " not found")
        else:
            myLinkedList[index] = None
            tempPointer = myLinkedListPointers[index]
            myLinkedListPointers[index] = heapStartPointer
            heapStartPointer = index
            if oldindex == nullPointer:
                # the first item in the list was deleted
                startPointer = tempPointer
            else:
                myLinkedListPointers[oldindex] = tempPointer
VB
Sub delete(ByVal itemDelete)
    Dim tempPointer, index, oldIndex As Integer
    If startPointer = nullPointer Then
        Console.WriteLine("Linked List empty")
    Else
        index = startPointer
        oldIndex = nullPointer
        ' AndAlso tests the pointer first so that a null pointer is never used as an index
        While index <> nullPointer AndAlso myLinkedList(index) <> itemDelete
            Console.WriteLine(myLinkedList(index) & " " & index)
            Console.ReadKey()
            oldIndex = index
            index = myLinkedListPointers(index)
        End While
        If index = nullPointer Then
            Console.WriteLine("Item " & itemDelete & " not found")
        Else
            myLinkedList(index) = Nothing
            tempPointer = myLinkedListPointers(index)
            myLinkedListPointers(index) = heapStartPointer
            heapStartPointer = index
            If oldIndex = nullPointer Then
                ' the first item in the list was deleted
                startPointer = tempPointer
            Else
                myLinkedListPointers(oldIndex) = tempPointer
            End If
        End If
    End If
End Sub
Java
static void delete(int itemDelete) {
    int oldIndex = nullPointer;
    if (startPointer == nullPointer)
        System.out.println("Linked List is empty");
    else {
        int index = startPointer;
        // test the pointer first so that a null pointer is never used as an index
        while (index != nullPointer && myLinkedList[index] != itemDelete) {
            oldIndex = index;
            index = myLinkedListPointers[index];
        }
        if (index == nullPointer)
            System.out.println("Item " + itemDelete + " not found");
        else {
            myLinkedList[index] = 0;
            int tempPointer = myLinkedListPointers[index];
            myLinkedListPointers[index] = heapStartPointer;
            heapStartPointer = index;
            if (oldIndex == nullPointer)
                startPointer = tempPointer; // the first item in the list was deleted
            else
                myLinkedListPointers[oldIndex] = tempPointer;
        }
    }
}
ACTIVITY 19I
Use the algorithm to remove 16 from myLinkedList. Show this in a trace table and show myLinkedList once 16 has been removed. Add the delete procedure to your program, add code to input an item, delete this item from the linked list, then print out the list and the pointers before and after the item was deleted.

Binary trees
A binary tree is another frequently used ADT. It is a hierarchical data structure in which each parent node can have a maximum of two child nodes. There are many uses for binary trees; for example, they are used in syntax analysis, compression algorithms and 3D video games.
Figure 19.10 shows the binary tree for the data stored in myList sorted in ascending order. Each item is stored at a node and each node can have up to two branches, with the rule: if the value to be added is less than the current node, branch left; if the value to be added is greater than or equal to the current node, branch right.
[Figure: binary tree with root node 27. The left subtree starts at 19, whose children are 16 and 21; 17 is to the right of 16. The right subtree starts at 36 and continues right to 42 and then 89; 55 is to the left of 89. Labels: root node, left pointer, right pointer, left subtree, right subtree, leaf node.]
▲ Figure 19.10 Example of an ordered binary tree
A binary tree can also be used to represent an arithmetic expression. Consider (a + b) * (c – a).
[Figure: binary tree with root *; its left child + has children a and b; its right child – has children c and a.]
▲ Figure 19.11 Example of an expression as a binary tree
ACTIVITY 19J
Draw the binary tree for the expression (x – y) / (x * y + z).
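The expression tree for (a + b) * (c – a) can be sketched in Python to show how the tree shape fixes the order of evaluation. This is an illustration only: the class name ExprNode, the function evaluate and the sample variable values are assumptions, not part of the text above.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Hypothetical node type: leaves hold operand names, other nodes hold operators.
@dataclass
class ExprNode:
    value: str
    left: Optional["ExprNode"] = None
    right: Optional["ExprNode"] = None

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node, variables):
    """Evaluate the expression tree, looking up operand values in variables."""
    if node.left is None and node.right is None:
        return variables[node.value]          # leaf node: an operand
    left_value = evaluate(node.left, variables)
    right_value = evaluate(node.right, variables)
    return OPERATORS[node.value](left_value, right_value)

# (a + b) * (c - a), built to match the shape of Figure 19.11
tree = ExprNode(
    "*",
    ExprNode("+", ExprNode("a"), ExprNode("b")),
    ExprNode("-", ExprNode("c"), ExprNode("a")),
)

result = evaluate(tree, {"a": 2, "b": 3, "c": 7})   # (2 + 3) * (7 - 2)
print(result)
```

Both subtrees are evaluated before the operator at their parent is applied, so no brackets need to be stored in the tree.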
EXTENSION ACTIVITY 19B Find out about different tree traversals and how they are used to convert an expression into reverse Polish.
The data structure for an ordered binary tree can be created in pseudocode as follows:
TYPE node
  DECLARE item : INTEGER
  DECLARE leftPointer : INTEGER
  DECLARE rightPointer : INTEGER
ENDTYPE
DECLARE myTree : ARRAY[0:8] OF node
DECLARE rootPointer : INTEGER
DECLARE nextFreePointer : INTEGER
ACTIVITY 19K Create the data structure in pseudocode for a binary tree to store a list of names. Your list must be able to store at least 50 names.
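The populated contents of the data structure myTree are shown below.
| myTree | item | leftPointer | rightPointer |
|---|---|---|---|
| [0] (root pointer) | 27 | 1 | 2 |
| [1] | 19 | 4 | 6 |
| [2] | 36 | −1 | 3 |
| [3] | 42 | −1 | 5 |
| [4] | 16 | −1 | 7 |
| [5] | 89 | 8 | −1 |
| [6] | 21 | −1 | −1 |
| [7] | 17 | −1 | −1 |
| [8] | 55 | −1 | −1 |
The left and right pointers point to items in the tree. −1 is used as a null pointer.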
▲ Figure 19.12
The root pointer points to the first node in a binary tree. A null pointer is a value stored in the left or right pointer in a binary tree to indicate that there are no nodes below this node on the left or right.

Finding an item in a binary tree
The algorithm to find if an item is in the binary tree myTree, and to return the pointer to its node if found or a null pointer if not found, could be written as a function in pseudocode, as shown.
DECLARE rootPointer : INTEGER
DECLARE itemPointer : INTEGER
DECLARE itemSearch : INTEGER
CONSTANT nullPointer = -1
rootPointer ← 0
FUNCTION find(itemSearch) RETURNS INTEGER
  itemPointer ← rootPointer
  WHILE (itemPointer <> nullPointer) AND myTree[itemPointer].item <> itemSearch DO
    IF myTree[itemPointer].item > itemSearch
      THEN
        itemPointer ← myTree[itemPointer].leftPointer
      ELSE
        itemPointer ← myTree[itemPointer].rightPointer
    ENDIF
  ENDWHILE
  RETURN itemPointer
ENDFUNCTION
Here is the identifier table for the binary tree search algorithm shown above.
| Identifier | Description |
|---|---|
| myTree | Tree to be searched |
| node | ADT for tree |
| rootPointer | Pointer to the start of the tree |
| leftPointer | Pointer to the left branch |
| rightPointer | Pointer to the right branch |
| nullPointer | Null pointer set to −1 |
| itemPointer | Pointer to current item |
| itemSearch | Item being searched for |
▲ Table 19.20
The trace table below shows the algorithm being used to search for 42 in myTree.
| rootPointer | itemPointer | itemSearch |
|---|---|---|
| 0 | 0 | 42 |
| | 2 | |
| | 3 | |
▲ Table 19.21 Trace table
ACTIVITY 19L
Use the algorithm to search for 55 and 75 in myTree. Show the results of each search in a trace table.
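As a rough check of the trace above, the search can be sketched in Python using three parallel lists in place of the record array myTree. The list names and the use of parallel lists are assumptions made for this sketch; the values are those of Figure 19.12.

```python
NULL_POINTER = -1

# Parallel lists standing in for the record array myTree (Figure 19.12)
item          = [27, 19, 36, 42, 16, 89, 21, 17, 55]
left_pointer  = [ 1,  4, -1, -1, -1,  8, -1, -1, -1]
right_pointer = [ 2,  6,  3,  5,  7, -1, -1, -1, -1]
root_pointer = 0

def find(item_search):
    """Return the index of item_search in the tree, or NULL_POINTER if absent."""
    item_pointer = root_pointer
    # The pointer is tested first: in Python an index of -1 would silently
    # read the last element of the list rather than fail.
    while item_pointer != NULL_POINTER and item[item_pointer] != item_search:
        if item[item_pointer] > item_search:
            item_pointer = left_pointer[item_pointer]    # branch left
        else:
            item_pointer = right_pointer[item_pointer]   # branch right
    return item_pointer

print(find(42))   # visits nodes 0, 2 and 3, as in the trace table
print(find(20))   # not in the tree, so the null pointer is returned
```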
Inserting items into a binary tree
The binary tree needs free nodes to add new items. For example, myTree, shown in Figure 19.13 below, now has room for 12 items. The last three nodes have not been filled yet. There is a pointer to the next free node, and the free nodes are set up like a heap in a linked list, using the left pointer.
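| myTree | item | leftPointer | rightPointer |
|---|---|---|---|
| [0] (root pointer) | 27 | 1 | 2 |
| [1] | 19 | 4 | 6 |
| [2] | 36 | −1 | 3 |
| [3] | 42 | −1 | 5 |
| [4] | 16 | −1 | 7 |
| [5] | 89 | 8 | −1 |
| [6] | 21 | −1 | −1 |
| [7] | 17 | −1 | −1 |
| [8] | 55 | −1 | −1 |
| [9] (next free pointer) | | 10 | |
| [10] | | 11 | |
| [11] | | −1 | |
The left and right pointers point to items in the tree. −1 is used as a null pointer. Leaves have null left and right pointers.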
▲ Figure 19.13
The algorithm to insert an item at a new node in the binary tree myTree could be written as a procedure in pseudocode as shown below.
TYPE node
  DECLARE item : INTEGER
  DECLARE leftPointer : INTEGER
  DECLARE rightPointer : INTEGER
ENDTYPE
DECLARE oldPointer : INTEGER
DECLARE leftBranch : BOOLEAN
DECLARE myTree : ARRAY[0:11] OF node // binary tree now has extra spaces
DECLARE rootPointer : INTEGER
DECLARE nextFreePointer : INTEGER
DECLARE itemPointer : INTEGER
DECLARE itemAdd : INTEGER
DECLARE itemAddPointer : INTEGER
CONSTANT nullPointer = -1 // needed to use the binary tree
PROCEDURE nodeAdd(itemAdd)
  // check for full tree
  IF nextFreePointer = nullPointer
    THEN
      OUTPUT "No nodes free"
    ELSE
      // use next free node
      itemAddPointer ← nextFreePointer
      nextFreePointer ← myTree[nextFreePointer].leftPointer
      itemPointer ← rootPointer
      // check for empty tree
      IF itemPointer = nullPointer
        THEN
          rootPointer ← itemAddPointer
        ELSE
          // find where to insert a new leaf
          WHILE (itemPointer <> nullPointer) DO
            oldPointer ← itemPointer
            IF myTree[itemPointer].item > itemAdd
              THEN
                // choose left branch
                leftBranch ← TRUE
                itemPointer ← myTree[itemPointer].leftPointer
              ELSE
                // choose right branch
                leftBranch ← FALSE
                itemPointer ← myTree[itemPointer].rightPointer
            ENDIF
          ENDWHILE
          // use left or right branch
          IF leftBranch
            THEN
              myTree[oldPointer].leftPointer ← itemAddPointer
            ELSE
              myTree[oldPointer].rightPointer ← itemAddPointer
          ENDIF
      ENDIF
      // store item to be added in the new node
      myTree[itemAddPointer].leftPointer ← nullPointer
      myTree[itemAddPointer].rightPointer ← nullPointer
      myTree[itemAddPointer].item ← itemAdd
  ENDIF
ENDPROCEDURE
Here is the identifier table.
| Identifier | Description |
|---|---|
| myTree | Tree to be searched |
| node | ADT for tree |
| rootPointer | Pointer to the start of the tree |
| leftPointer | Pointer to the left branch |
| rightPointer | Pointer to the right branch |
| nullPointer | Null pointer set to −1 |
| itemPointer | Pointer to current item in tree |
| itemAdd | Item to add to tree |
| nextFreePointer | Pointer to next free node |
| itemAddPointer | Pointer to position in tree to store item to be added |
| oldPointer | Pointer to leaf node that is going to point to item added |
| leftBranch | Flag to identify whether to go down the left branch or the right branch |
▲ Table 19.22
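The trace table below shows the algorithm being used to add 18 to myTree.
| leftBranch | nextFreePointer | itemAddPointer | rootPointer | itemAdd | itemPointer | oldPointer |
|---|---|---|---|---|---|---|
| | Already set to 9 | 9 | Already set to 0 | 18 | | |
| | 10 | | | | 0 | 0 |
| TRUE | | | | | 1 | 1 |
| TRUE | | | | | 4 | 4 |
| FALSE | | | | | 7 | 7 |
| | | | | | −1 | |
▲ Table 19.23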
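The tree, myTree, will now be as shown below.
| myTree | item | leftPointer | rightPointer | Note |
|---|---|---|---|---|
| [0] | 27 | 1 | 2 | |
| [1] | 19 | 4 | 6 | |
| [2] | 36 | −1 | 3 | |
| [3] | 42 | −1 | 5 | |
| [4] | 16 | −1 | 7 | |
| [5] | 89 | 8 | −1 | |
| [6] | 21 | −1 | −1 | |
| [7] | 17 | −1 | 9 | pointer to new node in correct position |
| [8] | 55 | −1 | −1 | |
| [9] | 18 | −1 | −1 | new leaf node |
| [10] | | 11 | | next free pointer now 10 |
| [11] | | −1 | | |
▲ Figure 19.14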
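The insertion of 18 can also be checked with a short array-based Python sketch that follows the nodeAdd pseudocode step by step. The class name ArrayTree, its attribute names and the choice to give unused nodes a right pointer of −1 are assumptions for this sketch; it is not the object-based implementation referred to for Chapter 20.

```python
NULL_POINTER = -1

class ArrayTree:
    """Binary tree held in parallel lists; free nodes are chained via left."""

    def __init__(self, item, left, right, root, next_free):
        self.item = item
        self.left = left
        self.right = right
        self.root = root
        self.next_free = next_free

    def node_add(self, item_add):
        """Insert item_add as a new leaf; return False if no node is free."""
        if self.next_free == NULL_POINTER:
            return False
        new_node = self.next_free                    # use next free node
        self.next_free = self.left[new_node]
        pointer = self.root
        if pointer == NULL_POINTER:                  # empty tree
            self.root = new_node
        else:
            while pointer != NULL_POINTER:           # find the parent leaf
                old_pointer = pointer
                left_branch = self.item[pointer] > item_add
                pointer = self.left[pointer] if left_branch else self.right[pointer]
            if left_branch:
                self.left[old_pointer] = new_node
            else:
                self.right[old_pointer] = new_node
        self.item[new_node] = item_add               # store item in the new node
        self.left[new_node] = NULL_POINTER
        self.right[new_node] = NULL_POINTER
        return True

# Contents of Figure 19.13; right pointers of unused nodes assumed to be -1
tree = ArrayTree(
    item=[27, 19, 36, 42, 16, 89, 21, 17, 55, None, None, None],
    left=[1, 4, -1, -1, -1, 8, -1, -1, -1, 10, 11, -1],
    right=[2, 6, 3, 5, 7, -1, -1, -1, -1, -1, -1, -1],
    root=0,
    next_free=9,
)
tree.node_add(18)
print(tree.item[9], tree.right[7], tree.next_free)
```

After the call, node 9 holds 18, node 7 points right to node 9 and the next free pointer is 10, which agrees with Figure 19.14.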
ACTIVITY 19M Use the algorithm to add 25 to myTree. Show this in a trace table and show myTree once 25 has been added.
Binary trees are more commonly implemented in Python, VB.NET or Java using objects and recursion. An example will be given in Chapter 20.
Graphs
A graph is a non-linear data structure consisting of a set of nodes and a set of edges, where each edge joins a pair of nodes. This ADT is used to implement directed and undirected graphs. If the edges have a direction from one node to the other, it is a directed graph.
[Figure: an undirected graph and a directed graph, with nodes and edges labelled.]
▲ Figure 19.15
As we saw in Chapter 18, graphs are used to represent real life networks, such as
» bus routes, where the nodes are bus stops and the edges connect two stops next to each other
» websites, where each web page is a node and the edges show the links between each web page
» social media networks, where each node contains information about a person and the edges connect people who are friends.
Each edge may have a weight; for example, in the bus route, the weight could be the distance between bus stops or the cost of the bus fare. A path is the list of nodes connected by edges between two given nodes, and a cycle is a path that returns to the node it started from. For example, a graph of the bus routes in a town could be as follows. The distance between each bus stop in kilometres is shown on the graph.
[Figure: weighted graph of the bus stops Town centre, School, Shopping centre, Train station, Gardens and River. The edge weights shown are 0.5, 0.9, 0.5, 1.5, 2.5, 2.0, 0.5, 1.2 and 0.7.]
▲ Figure 19.16
A path from School to Gardens could be Path = (School, Train station, River, Gardens).
ACTIVITY 19N
Find another path from School to Gardens. Find the shortest path from Town centre to Train station. Find the shortest cycle from the Town centre.
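The ideas of weighted edges, paths and cycles can be sketched in Python with a dictionary of neighbours. The graph below is hypothetical: the node names A to D and all the weights are invented for illustration and are not the bus route graph in Figure 19.16.

```python
# Hypothetical undirected weighted graph: graph[x][y] is the weight of edge x-y.
graph = {
    "A": {"B": 0.5, "C": 1.5},
    "B": {"A": 0.5, "C": 0.7, "D": 2.0},
    "C": {"A": 1.5, "B": 0.7, "D": 1.2},
    "D": {"B": 2.0, "C": 1.2},
}

def is_path(graph, nodes):
    """True if every consecutive pair of nodes is joined by an edge."""
    return all(b in graph[a] for a, b in zip(nodes, nodes[1:]))

def path_length(graph, nodes):
    """Sum of the edge weights along a path."""
    if not is_path(graph, nodes):
        raise ValueError("nodes do not form a path")
    return sum(graph[a][b] for a, b in zip(nodes, nodes[1:]))

def is_cycle(graph, nodes):
    """True if the nodes form a path that returns to its starting node."""
    return len(nodes) > 2 and nodes[0] == nodes[-1] and is_path(graph, nodes)

print(is_path(graph, ["A", "B", "C", "D"]))
print(path_length(graph, ["A", "B", "C", "D"]))
print(is_cycle(graph, ["A", "B", "C", "A"]))
```

Storing each edge under both of its end nodes makes the graph undirected; a directed graph would store an edge under its starting node only.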
19.1.4 Implementing one ADT from another ADT
Every ADT is a collection of data and the methods used on the data. When an ADT is defined, the definition can refer to other data types. For example, myLinkedList refers to the data type INTEGER in its data definition. A linked list type could be defined as follows.
TYPE linkedList
  DECLARE item : INTEGER
  DECLARE pointer : INTEGER
ENDTYPE // a linked list to store integers
And then used as follows.
DECLARE myLinkedList : ARRAY [0:11] OF linkedList
DECLARE heapStartPointer : INTEGER
DECLARE startPointer : INTEGER
DECLARE index : INTEGER
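One possible way to mirror this record type in Python is with a dataclass. The names LinkedListNode and create_list are assumptions for this sketch, and linking every node into the heap at the start is one common way to initialise the structure rather than something the declaration above requires.

```python
from dataclasses import dataclass

NULL_POINTER = -1

@dataclass
class LinkedListNode:
    item: int = 0                  # data stored in the node
    pointer: int = NULL_POINTER    # index of the next node

def create_list(size):
    """Return size empty nodes, each pointing to the next, forming the heap."""
    nodes = [LinkedListNode(pointer=i + 1) for i in range(size)]
    nodes[-1].pointer = NULL_POINTER   # last free node ends the heap
    return nodes

my_linked_list = create_list(12)   # indices 0 to 11, as in ARRAY [0:11]
start_pointer = NULL_POINTER       # the list itself starts empty
heap_start_pointer = 0             # first free node

print(my_linked_list[0], my_linked_list[11])
```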
ACTIVITY 19O Write pseudocode to declare a linked list to store names. Use this to write pseudocode to set up a linked list that will store 30 names. Write a program to store and display names in this linked list.
The data types for a stack, a queue and a binary tree have been defined using existing data types. Another data type is a dictionary, an ADT made up of pairs, each consisting of a key and a value, where the key is used to find the value. Each key can appear only once, and keys in a dictionary are unordered. A value is retrieved from a dictionary by specifying its corresponding key. The same value may appear more than once; a dictionary differs from a set because the values can be duplicated. As a dictionary is not an ordered list, it can be declared using a linked list as part of the definition. A dictionary type could be defined in pseudocode as follows.
TYPE linkedList
  DECLARE item : STRING
  DECLARE pointer : INTEGER
ENDTYPE
TYPE dictionary
  DECLARE key : ARRAY [0:19] OF linkedList
  DECLARE value : ARRAY [0:19] OF STRING
ENDTYPE
And then used as follows.
DECLARE myDictionary : dictionary
DECLARE heapStartPointer : INTEGER
DECLARE startPointer : INTEGER
DECLARE index : INTEGER
ACTIVITY 19P In the programming language of your choice, write a program to use a dictionary to store the names of students as their keys and their examination scores as their values. Then find a student's examination score, add a student and score and delete a student and score.
Each of the programming languages used in Cambridge International A Level Computer Science provides a dictionary data type, as shown in the table below.

Python
studentdict = {"Leon": 27, "Ahmad": 78, "Susie": 64}
Or
studentdict = dict([("Leon", 27), ("Ahmad", 78), ("Susie", 64)])

VB
Dim studentdict As New Dictionary(Of String, Integer)
studentdict.Add("Leon", 27)
studentdict.Add("Ahmad", 78)
studentdict.Add("Susie", 64)

Java (Dictionary is no longer used in Java but can be implemented using a hash table)
Dictionary<String, Integer> studentdict = new Hashtable<>();
studentdict.put("Leon", 27);
studentdict.put("Ahmad", 78);
studentdict.put("Susie", 64);
▲ Table 19.24
19.1.5 Comparing algorithms
Big O notation is a mathematical notation used to describe the performance or complexity of an algorithm in relation to the time taken or the memory used for the task. It is used to describe the worst-case scenario; for example, how the maximum number of comparisons required to find a value in a list using a particular search algorithm increases with the number of values in the list.
| Big O order of time complexity | Description | Example |
|---|---|---|
| O(1) | describes an algorithm that always takes the same time to perform the task | deciding if a number is even or odd |
| O(N) | describes an algorithm where the time to perform the task will grow linearly in direct proportion to N, the number of items of data the algorithm is using | a linear search |
| O(N^2) | describes an algorithm where the time to perform the task will grow in direct proportion to the square of N, the number of items of data the algorithm is using | bubble sort, insertion sort |
| O(2^N) | describes an algorithm where the time to perform the task doubles every time the algorithm uses an extra item of data | calculation of Fibonacci numbers using recursion (see Section 19.2) |
| O(log N) | describes an algorithm where the time to perform the task goes up linearly as the number of items goes up exponentially | binary search |
▲ Table 19.25 Big O order of time complexity
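The difference between O(N) and O(log N) can be made concrete by counting how many items each search examines in a worst case. The sketch below is illustrative only; the function names are assumptions, and the binary search counts one probe of the list per loop iteration.

```python
def linear_search_comparisons(values, target):
    """Count the items examined by a linear search for target."""
    comparisons = 0
    for value in values:
        comparisons += 1
        if value == target:
            break
    return comparisons

def binary_search_comparisons(values, target):
    """Count the items examined by a binary search of a sorted list."""
    low, high = 0, len(values) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1              # one item examined per iteration
        if values[mid] == target:
            break
        elif values[mid] < target:
            low = mid + 1             # discard the lower half
        else:
            high = mid - 1            # discard the upper half
    return comparisons

values = list(range(1000))            # sorted list of N = 1000 items
missing = 1000                        # worst case: the target is not present
linear_count = linear_search_comparisons(values, missing)
binary_count = binary_search_comparisons(values, missing)
print(linear_count, binary_count)
```

For this list the linear search examines all 1000 items, while the binary search examines 10, because each probe halves the part of the list still to be searched.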
| Big O order of space complexity | Description | Example |
|---|---|---|
| O(1) | describes an algorithm that always uses the same space to perform the task | any algorithm that just uses variables, for example d = a + b + c |
| O(N) | describes an algorithm where the space to perform the task will grow linearly in direct proportion to N, the number of items of data the algorithm is using | any algorithm that uses arrays, for example a loop to calculate a running total of values input to an array of N elements |
▲ Table 19.26 Big O order of space complexity
ACTIVITY 19Q
1 Using diagrams, describe the structure of
a) a binary tree
b) a linked list.
2 a) Explain what is meant by a dictionary data type.
b) Show how a dictionary data type can be constructed from a linked list.
3 Compare the performance of a linear search and a binary search using Big O notation.
19.2 Recursion
WHAT YOU SHOULD ALREADY KNOW
Remind yourself of the definitions of the following mathematical functions, which many of you will be familiar with, and see how they are constructed.
■ Factorials
■ Arithmetic sequences
■ Fibonacci numbers
■ Compound interest

Key terms
Recursion – a process using a function or procedure that is defined in terms of itself and calls itself.
Base case – a terminating solution to a process that is not recursive.
General case – a solution to a process that is recursively defined.
Winding – process which occurs when a recursive function or procedure is called until the base case is found.
Unwinding – process which occurs when a recursive function finds the base case and the function returns the values.
19.2.1 Understanding recursion
Recursion is a process using a function or procedure that is defined in terms of itself and calls itself. The process is defined using a base case, a terminating solution to a process that is not recursive, and a general case, a solution to a process that is recursively defined. For example, a function to calculate the factorial n! of any non-negative whole number n is recursive. The definition for the function uses:
a base case of 0! = 1
a general case of n! = n * (n–1)!
This can be written in pseudocode as a recursive function.
FUNCTION factorial(number : INTEGER) RETURNS INTEGER
  IF number = 0
    THEN
      answer ← 1 // base case
    ELSE
      answer ← number * factorial(number - 1) // recursive call with general case
  ENDIF
  RETURN answer
ENDFUNCTION
With recursive functions, the statements after the recursive function call are not executed until the base case is reached; this is called winding. After the base case is reached and can be used in the recursive process, the function is unwinding. In order to understand how the winding and unwinding processes in recursion work, we can use a trace table for a specific example: 3!
| Call number | Function call | number | answer | RETURN | Phase |
|---|---|---|---|---|---|
| 1 | factorial(3) | 3 | 3 * factorial(2) | | winding |
| 2 | factorial(2) | 2 | 2 * factorial(1) | | winding |
| 3 | factorial(1) | 1 | 1 * factorial(0) | | winding |
| 4 | factorial(0) | 0 | 1 | 1 | base case |
| 3 continued | factorial(1) | 1 | 1 * 1 | 1 | unwinding |
| 2 continued | factorial(2) | 2 | 2 * 1 | 2 | unwinding |
| 1 continued | factorial(3) | 3 | 3 * 2 | 6 | unwinding |
▲ Table 19.27
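The winding and unwinding in the trace table can be made visible by recording each call and each return. The sketch below adds two parameters, depth and log, which are assumptions for illustration and are not part of the factorial function in the text.

```python
def factorial(number, log, depth=0):
    """Recursive factorial that records each call and return in log."""
    indent = "  " * depth
    log.append(f"{indent}call factorial({number})")      # winding
    if number == 0:
        answer = 1                                       # base case
    else:
        answer = number * factorial(number - 1, log, depth + 1)
    log.append(f"{indent}return {answer}")               # unwinding
    return answer

log = []
result = factorial(3, log)
print("\n".join(log))
```

The four call lines are recorded first, one per level of indentation, and the four return lines follow in the opposite order, matching the winding and unwinding rows of the trace table.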
Here is a simple recursive factorial program written in Python, VB and Java using a function.
Python
#Python program recursive factorial function
def factorial(number):
    if number == 0:
        answer = 1
    else:
        answer = number * factorial(number - 1)
    return answer

print(factorial(0))
print(factorial(5))
VB
'VB program recursive factorial function
Module Module1
    Sub Main()
        Console.WriteLine(factorial(0))
        Console.WriteLine(factorial(5))
        Console.ReadKey()
    End Sub

    Function factorial(ByVal number As Integer) As Integer
        Dim answer As Integer
        If number = 0 Then
            answer = 1
        Else
            answer = number * factorial(number - 1)
        End If
        Return answer
    End Function
End Module
Java
// Java program recursive factorial function
public class Factorial {
    public static void main(String[] args) {
        System.out.println(factorial(0));
        System.out.println(factorial(5));
    }

    public static int factorial(int number) {
        int answer;
        if (number == 0)
            answer = 1;
        else
            answer = number * factorial(number - 1);
        return answer;
    }
}
ACTIVITY 19R
Write the recursive factorial function in the programming language of your choice. Test your program with 0! and 5! Complete trace tables for 0! and 5! using the recursive factorial function written in pseudocode and compare the results from your program with the trace tables.
Compound interest can be calculated using a recursive function, where principal is the amount of money invested, rate is the rate of interest and years is the number of years the money has been invested.
| The base case is | total_0 = principal where years = 0 |
| The general case is | total_n = total_(n-1) * rate |
▲ Table 19.28
FUNCTION compoundInt(principal, rate : REAL, years : INTEGER) RETURNS REAL
  IF years = 0
    THEN
      total ← principal // base case
    ELSE
      total ← compoundInt(principal, rate, years - 1) * rate // general case
  ENDIF
  RETURN total
ENDFUNCTION
This function can be traced for a principal of 100 over three years at 1.05 (5% interest).
| Call number | Function call | years | total | RETURN |
|---|---|---|---|---|
| 1 | compoundInt(100, 1.05, 3) | 3 | compoundInt(100, 1.05, 2) * 1.05 | |
| 2 | compoundInt(100, 1.05, 2) | 2 | compoundInt(100, 1.05, 1) * 1.05 | |
| 3 | compoundInt(100, 1.05, 1) | 1 | compoundInt(100, 1.05, 0) * 1.05 | |
| 4 | compoundInt(100, 1.05, 0) | 0 | 100 | 100 |
| 3 cont | compoundInt(100, 1.05, 1) | 1 | 105 | 105 |
| 2 cont | compoundInt(100, 1.05, 2) | 2 | 110.25 | 110.25 |
| 1 cont | compoundInt(100, 1.05, 3) | 3 | 115.76 | 115.76 |
▲ Table 19.29
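The base case and general case in Table 19.28 can be written directly as a short Python function. This is a sketch that follows the general case total_n = total_(n-1) * rate; the name compound_int is an assumption, and the result is rounded only for display.

```python
def compound_int(principal, rate, years):
    """Total after the given number of years, with rate as a multiplier."""
    if years == 0:
        return principal                                   # base case
    return compound_int(principal, rate, years - 1) * rate  # general case

total = compound_int(100, 1.05, 3)
print(round(total, 2))
```

Each level of the recursion multiplies the total for the previous year by the rate, so the values produced while unwinding are the totals after one, two and three years.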
ACTIVITY 19S
The Fibonacci series is defined as a sequence of numbers in which the first two numbers are 0 and 1 (or 1 and 1, depending on the selected beginning point of the sequence), and each subsequent number is the sum of the previous two. Identify the base case and the general case for this series. Write a pseudocode algorithm to find and output the nth term. Test your algorithm by drawing a trace table for the fourth term.

EXTENSION ACTIVITY 19C
Write your function from Activity 19S in the high-level programming language of your choice. Test this with the 5th and 27th terms.
Benefits of recursion
Recursive solutions can contain fewer programming statements than an iterative solution, and they can solve complex problems in a simpler way. However, if a procedure or function makes a very large number of recursive calls, there is very heavy use of the stack, which can lead to stack overflow. For example, factorial(100) would require 100 recursive calls to be placed on the stack before the function unwinds.
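The cost of heavy stack use can be seen in Python, which limits the depth of recursion and raises RecursionError when the limit is passed instead of letting the stack overflow. The sketch below compares a recursive and an iterative factorial; the function names are assumptions for illustration.

```python
import sys

def factorial_recursive(n):
    """One stack frame is used for every value from n down to 0."""
    if n == 0:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    """Uses a loop, so the stack does not grow with n."""
    answer = 1
    for value in range(2, n + 1):
        answer *= value
    return answer

def exceeds_stack(n):
    """True if the recursive version runs out of permitted stack depth."""
    try:
        factorial_recursive(n)
    except RecursionError:
        return True
    return False

too_deep = sys.getrecursionlimit() + 100
print(exceeds_stack(100))        # shallow enough to complete
print(exceeds_stack(too_deep))   # deeper than Python allows
print(factorial_iterative(too_deep) > 0)
```

The iterative version handles the larger value without difficulty because it never has more than one call on the stack.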
19.2.2 How a compiler implements recursion
Recursive code needs to make use of the stack; therefore, in order to implement recursive procedures and functions in a high-level programming language, a compiler must produce object code that pushes return addresses and values of local variables onto the stack with each recursive call, winding. The object code then pops the return addresses and values of local variables off the stack, unwinding.
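The pushing and popping described above can be imitated with an explicit stack. This is a simplified model rather than what any particular compiler generates: only the local value of number is saved for each call, return addresses are left out, and the function name is an assumption.

```python
def factorial_with_stack(number):
    """Factorial using an explicit stack in place of recursive calls."""
    stack = []
    # Winding: push the local variable for each call until the base case.
    while number != 0:
        stack.append(number)
        number -= 1
    answer = 1                      # base case: 0! = 1
    # Unwinding: pop each saved value and complete that call's calculation.
    while stack:
        answer = stack.pop() * answer
    return answer

print(factorial_with_stack(0))
print(factorial_with_stack(5))
```

For factorial_with_stack(3) the values 3, 2 and 1 are pushed while winding and popped in the order 1, 2, 3 while unwinding, which is the order of the "continued" rows in a trace table for 3!.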
ACTIVITY 19T 1 Explain what is meant by recursion and give the benefits of using recursion in programming. 2 Explain why a compiler needs to produce object code that uses the stack for a recursive procedure.
End of chapter questions
1 Data is stored in the array NameList[1:10]. This data is to be sorted.
a) i) Copy and complete this pseudocode algorithm for an insertion sort. [7]
FOR ThisPointer ← 2 TO .............................................
  // use a temporary variable to store item which is to
  // be inserted into its correct location
  Temp ← NameList[ThisPointer]
  Pointer ← ThisPointer – 1
  WHILE (NameList[Pointer] > Temp) AND .....................
    // move list item to next location
    NameList[......................] ← NameList[.....................]
    Pointer ← .................................................
  ENDWHILE
  // insert value of Temp in correct location
  NameList[....................................] ← ..........................
ENDFOR
ii) A special case is when NameList is already in order. The algorithm in part a) i) is applied to this special case. Explain how many iterations are carried out for each of the loops. [3]
b) An alternative sort algorithm is a bubble sort:
FOR ThisPointer ← 1 TO 9
  FOR Pointer ← 1 TO 9
    IF NameList[Pointer] > NameList[Pointer + 1]
      THEN
        Temp ← NameList[Pointer]
        NameList[Pointer] ← NameList[Pointer + 1]
        NameList[Pointer + 1] ← Temp
    ENDIF
  ENDFOR
ENDFOR
i) As in part a) ii), a special case is when NameList is already in order. The algorithm in part b) is applied to this special case. Explain how many iterations are carried out for each of the loops. [2]
ii) Rewrite the algorithm in part b), using pseudocode, to reduce the number of unnecessary comparisons. Use the same variable names where appropriate. [5]
Cambridge International AS & A Level Computer Science 9608 Paper 41 Q5 June 2015

2 A Queue Abstract Data type (ADT) has these associated operations:
– create queue
– add item to queue
– remove item from queue
The queue ADT is to be implemented as a linked list of nodes. Each node consists of data and a pointer to the next node.
a) The following operations are carried out:
CreateQueue
AddName("Ali")
AddName("Jack")
AddName("Ben")
AddName("Ahmed")
RemoveName
AddName("Jatinder")
RemoveName
Copy the diagram and add appropriate labels to show the final state of the queue. Use the space on the left as a workspace. Show your final answer in the node shapes on the right. [3]
b) Using pseudocode, a record type, Node, is declared as follows: TYPE Node DECLARE Name : STRING DECLARE Pointer : INTEGER ENDTYPE
The statement
DECLARE Queue : ARRAY[1:10] OF Node
reserves space for 10 nodes in array Queue.
i) The CreateQueue operation links all nodes and initialises the three pointers that need to be used: HeadPointer, TailPointer and FreePointer. Copy and complete the diagram to show the value of all pointers after CreateQueue has been executed. [4]
[Diagram: boxes for HeadPointer, TailPointer and FreePointer beside the array Queue, which has rows [1] to [10] and columns Name and Pointer.]
ii) The algorithm for adding a name to the queue is written, using pseudocode, as a procedure with the header:
PROCEDURE AddName(NewName)
where NewName is the new name to be added to the queue. The procedure uses the variables as shown in the identifier table.
| Identifier | Data type | Description |
|---|---|---|
| Queue | Array[1:10] OF Node | Array to store node data |
| NewName | STRING | Name to be added |
| FreePointer | INTEGER | Pointer to next free node in array |
| HeadPointer | INTEGER | Pointer to first node in queue |
| TailPointer | INTEGER | Pointer to last node in queue |
| CurrentPointer | INTEGER | Pointer to current node |
PROCEDURE AddName(BYVALUE NewName : STRING)
  // Report error if no free nodes remaining
  IF FreePointer = 0
    THEN
      Report Error
    ELSE
      // new name placed in node at head of free list
      CurrentPointer ← FreePointer
      Queue[CurrentPointer].Name ← NewName
      // adjust free pointer
      FreePointer ← Queue[CurrentPointer].Pointer
      // if first name in queue then adjust head pointer
      IF HeadPointer = 0
        THEN
          HeadPointer ← CurrentPointer
      ENDIF
      // current node is new end of queue
      Queue[CurrentPointer].Pointer ← 0
      TailPointer ← CurrentPointer
  ENDIF
ENDPROCEDURE
Copy and complete the pseudocode for the procedure RemoveName. Use the variables listed in the identifier table. [6]
PROCEDURE RemoveName()
  // Report error if Queue is empty
  .............................................................................
  .............................................................................
  .............................................................................
  .............................................................................
  OUTPUT Queue[………………………………………………].Name
  // current node is head of queue
  .............................................................................
  // update head pointer
  .............................................................................
  // if only one element in queue then update tail pointer
  .............................................................................
  .............................................................................
  .............................................................................
  .............................................................................
  // link released node to free list
  .............................................................................
  .............................................................................
  .............................................................................
ENDPROCEDURE
Cambridge International AS & A Level Computer Science 9608 Paper 41 Q6 June 2015
20 Further programming
In this chapter, you will learn about
★ the characteristics of a number of programming paradigms, including low-level programming, imperative (procedural) programming, object-oriented programming and declarative programming
★ how to write code to perform file-processing operations on serial, sequential and random files
★ exceptions and the importance of exception handling.
20.1 Programming paradigms
WHAT YOU SHOULD ALREADY KNOW
In Chapter 4, Section 4.2, you learnt about assembly language, and in Chapter 11, Section 11.3, you learnt about structured programming. Review these sections then try these four questions before you read the first part of this chapter.
1 Describe four modes of addressing in assembly language.
2 Write an assembly language program to add the numbers 7 and 5 together and store the result in the accumulator.
3 a) Explain the difference between a procedure and a function.
b) Describe how to pass parameters.
c) Describe the difference between a procedure definition and a procedure call.
4 Write a short program that uses a procedure.
Throughout this section, you will be prompted to refer to previous chapters to review related content.
Key terms
Programming paradigm – a set of programming concepts.
Low-level programming – programming instructions that use the computer’s basic instruction set.
Imperative programming – programming paradigm in which the steps required to execute a program are set out in the order they need to be carried out.
Object-oriented programming (OOP) – a programming methodology that uses self-contained objects, which contain programming statements (methods) and data, and which communicate with each other.
Class – a template defining the methods and data of a certain type of object.
Attributes (class) – the data items in a class.
Method – a programmed procedure that is defined as part of a class.
Encapsulation – process of putting data and methods together as a single unit, a class.
Object – an instance of a class that is self-contained and includes data and methods.
Property – data and methods within an object that perform a named action.
Instance – an occurrence of an object during the execution of a program.
Data hiding – technique which protects the integrity of an object by restricting access to the data and methods within that object.
Inheritance – process in which the methods and data from one class, a superclass or base class, are copied to another class, a derived class.
Polymorphism – feature of object-oriented programming that allows methods to be redefined for derived classes.
Overloading – feature of object-oriented programming that allows a method to be defined more than once in a class, so it can be used in different situations.
Setter – a method used to control changes to a variable.
Constructor – a method used to initialise a new object.
Destructor – a method that is automatically invoked when an object is destroyed.
Declarative programming – statements of facts and rules together with a mechanism for setting goals in the form of a query.
Fact – a ‘thing’ that is known.
Rules – relationships between facts.
A programming paradigm is a set of programming concepts. We have already considered two different programming paradigms: low-level and imperative (procedural) programming. The style and capability of any programming language is defined by its paradigm. Some programming languages, for example JavaScript, only follow one paradigm; others, for example Python, support multiple paradigms. Most programming languages are multi-paradigm. In this section of the chapter, we will consider four programming paradigms: low-level, imperative, objectoriented and declarative.
20.1 Programming paradigms
Containment (aggregation) – process by which one class can contain other classes.
Getter – a method that gets the value of a property.
20.1.1 Low-level programming
Low-level programming uses instructions from the computer’s basic instruction set. Assembly language and machine code both use low-level instructions. This type of programming is used when the program needs to make use of specific addresses and registers in a computer, for example when writing a printer driver.
In Chapter 4, Section 4.2.4, we looked at addressing modes. These are also covered by the Cambridge International A Level syllabus. Review Section 4.2.4 before completing Activity 20A.
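As a reminder of how the addressing modes differ, here is a minimal Python sketch. It is only an illustration, not real assembly language: the memory contents are hypothetical values chosen for this example, and the function names simply borrow the opcodes LDM, LDD, LDI and LDX to show which value each mode would place in the accumulator.

```python
# Hypothetical memory contents, chosen only for illustration.
memory = {100: 102, 101: 8, 102: 15, 103: 23}

def ldm(n):
    """Immediate addressing: the operand itself is loaded."""
    return n

def ldd(address):
    """Direct addressing: load the contents of the given address."""
    return memory[address]

def ldi(address):
    """Indirect addressing: the given address holds the address of the data."""
    return memory[memory[address]]

def ldx(address, ix):
    """Indexed addressing: load from (address + contents of the index register)."""
    return memory[address + ix]

print(ldm(100))     # 100
print(ldd(100))     # 102
print(ldi(100))     # 15
print(ldx(100, 1))  # 8
```

The same operand, 100, gives four different results depending only on the addressing mode used.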
ACTIVITY 20A
A section of memory in a computer contains these denary values:

Address    Denary value
230        231
231        5
232        7
233        9
234        11
235        0

Give the value stored in the accumulator (ACC) and the index register (IX) after each of these instructions has been executed and state the mode of addressing used.

Address    Opcode    Operand
500        LDM       #230
501        LDD       230
502        LDI       230
503        LDR       #1
504        LDX       230
505        CMP       #0
506        JPE       509
507        INC       IX
508        JMP       504
509        JMP       509    // this stops the program, it executes the same instruction until the computer is turned off!
20.1.2 Imperative programming
In imperative programming, the steps required to execute a program are set out in the order they need to be carried out. This programming paradigm is often used in the early stages of teaching programming. Imperative programming is often developed into structured programming, which has a more logical structure and makes use of procedures and functions, together with local and global variables. Imperative programming is also known as procedural programming.
Programs written using the imperative paradigm may be smaller and take less time to execute than programs written using the object-oriented or declarative paradigms. This is because there are fewer instructions and less data storage is required for the compiled object code. Imperative programming works well for small, simple programs. Programs written using this methodology can be easier for others to read and understand.
In Chapter 11, Section 11.3, we looked at structured programming. This is also covered by the Cambridge International A Level syllabus. Review Section 11.3 then complete Activity 20B.
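The difference between the basic imperative style and the structured style can be sketched in Python as follows. The task (averaging three marks), the variable names and the function name calculate_average are all assumptions made for this illustration.

```python
# Basic imperative style: statements run in order, using only global variables.
mark1 = 12
mark2 = 15
mark3 = 18
total = mark1 + mark2 + mark3
average = total / 3
print("Average", average)

# Structured (procedural) style: the same steps wrapped in a function
# that takes a parameter and uses a local variable.
def calculate_average(marks):
    total_marks = sum(marks)          # local variable
    return total_marks / len(marks)

print("Average", calculate_average([12, 15, 18]))
```

Both versions produce the same result, but the function can be reused with any list of marks without repeating the calculation.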
ACTIVITY 20B
Write a pseudocode algorithm to calculate the areas of five different shapes (square, rectangle, triangle, parallelogram and circle) using the basic imperative programming paradigm (no procedures or functions, and using only global variables).
Rewrite the pseudocode algorithm in a more structured way using the procedural programming paradigm (make sure you use procedures, functions, and local and global variables).
Write and test both algorithms using the programming language of your choice.

20.1.3 Object-oriented programming (OOP)
Object-oriented programming (OOP) is a programming methodology that uses self-contained objects, which contain programming statements (methods) and data, and which communicate with each other. This programming paradigm is often used to solve more complex problems as it enables programmers to work with real-life things. Many procedural programming languages have been developed to support OOP. For example, Java, Python and Visual Basic all allow programmers to use either procedural programming or OOP.
Object-oriented programming uses its own terminology, which we will explore here.
Class
A class is a template defining the methods and data of a certain type of object. The attributes are the data items in a class. A method is a programmed procedure that is defined as part of a class. Putting the data and methods together as a single unit, a class, is called encapsulation. To ensure that only the methods declared can be used to access the data within a class, attributes need to be declared as private and the methods need to be declared as public.
For example, a shape can have name, area and perimeter as attributes and the methods set shape, calculate area, calculate perimeter. This information can be shown in a class diagram (Figure 20.1).

Class name:
    Shape
Attributes declared as private:
    Name : STRING
    Area : REAL
    Perimeter : REAL
Methods declared as public:
    SetShape ()
    calculateArea ()
    calculatePerimeter ()

▲ Figure 20.1 Shape class diagram
Object
When writing a program, an object needs to be declared using a class type that has already been defined. An object is an instance of a class that is self-contained and includes data and methods. Properties of an object are the data and methods within an object that perform named actions. An occurrence of an object during the execution of a program is called an instance.
For example, a class employee is defined and the object myStaff is instanced in these programs using Python, VB and Java.

Python

class employee:                               # class definition
    def __init__(self, name, staffno):
        self.name = name
        self.staffno = staffno

    def showDetails(self):
        print("Employee Name " + self.name)
        print("Employee Number ", self.staffno)

myStaff = employee("Eric Jones", 72)          # object
myStaff.showDetails()

VB
Module Module1
    Public Sub Main()
        Dim myStaff As New employee("Eric Jones", 72)    ' object
        myStaff.showDetails()
    End Sub

    Class employee                                       ' class definition
        Dim name As String
        Dim staffno As Integer

        Public Sub New(ByVal n As String, ByVal s As Integer)
            name = n
            staffno = s
        End Sub

        Public Sub showDetails()
            Console.WriteLine("Employee Name " & name)
            Console.WriteLine("Employee Number " & staffno)
            Console.ReadKey()
        End Sub
    End Class
End Module
Java

class employee {                              // class definition
    String name;
    int staffno;

    employee(String n, int s) {
        name = n;
        staffno = s;
    }

    void showDetails() {
        System.out.println("Employee Name " + name);
        System.out.println("Employee Number " + staffno);
    }

    public static void main(String[] args) {
        employee myStaff = new employee("Eric Jones", 72);    // object
        myStaff.showDetails();
    }
}

Data hiding protects the integrity of an object by restricting access to the data and methods within that object. One way of achieving data hiding in OOP is to use encapsulation. Data hiding reduces the complexity of programming and increases data protection and the security of data.
Here is an example of a definition of a class with private attributes in Python, VB and Java.

Python
class employee:
    def __init__(self, name, staffno):
        self.__name = name                    # attributes are private:
        self.__staffno = staffno              # use of __ denotes private in Python

    def showDetails(self):                    # method is public
        print("Employee Name " + self.__name)
        print("Employee Number ", self.__staffno)
VB
Class employee
    Private name As String                    ' attributes are private
    Private staffno As Integer

    Public Sub New(ByVal n As String, ByVal s As Integer)    ' constructor to set attributes
        name = n
        staffno = s
    End Sub

    Public Sub showDetails()                  ' methods are public
        Console.WriteLine("Employee Name " & name)
        Console.WriteLine("Employee Number " & staffno)
        Console.ReadKey()
    End Sub
End Class

Java
// Java employee OOP program
class employee {
    private String name;                      // attributes are private
    private int staffno;

    employee(String n, int s) {               // constructor to set attributes
        name = n;
        staffno = s;
    }

    public void showDetails() {               // methods are public
        System.out.println("Employee Name " + name);
        System.out.println("Employee Number " + staffno);
    }
}

public class MainObject {
    public static void main(String[] args) {
        employee myStaff = new employee("Eric Jones", 72);
        myStaff.showDetails();
    }
}
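The effect of data hiding can be checked with a short Python sketch based on the employee class above. The variable blocked is a hypothetical name added here only to record whether direct access from outside the class failed.

```python
class employee:
    def __init__(self, name, staffno):
        self.__name = name              # private attributes
        self.__staffno = staffno

    def showDetails(self):              # public method
        print("Employee Name " + self.__name)
        print("Employee Number ", self.__staffno)

myStaff = employee("Eric Jones", 72)
myStaff.showDetails()                   # allowed: uses the public method

try:
    print(myStaff.__name)               # direct access from outside the class
    blocked = False
except AttributeError:
    blocked = True

print("Direct access blocked:", blocked)
```

Python enforces this by renaming attributes that start with a double underscore inside the class, so privacy in Python is a strong convention rather than the strict rule applied by VB and Java.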
ACTIVITY 20C Write a short program to declare a class, student, with the private attributes name, dateOfBirth and examMark, and the public method displayExamMark. Declare an object myStudent, with a name and exam mark of your choice, and use your method to display the exam mark.
Inheritance
Inheritance is the process by which the methods and data from one class, a superclass or base class, are copied to another class, a derived class.
Figure 20.2 shows single inheritance, in which a derived class inherits from a single superclass.

superclass:
    shape
derived classes:
    square
    rectangle
    triangle
    parallelogram
    circle

▲ Figure 20.2 Inheritance diagram – single inheritance
Multiple inheritance is where a derived class inherits from more than one superclass (Figure 20.3).

superclasses:
    superclass 1
    superclass 2
derived class (inherits from both superclasses):
    derived class

▲ Figure 20.3 Inheritance diagram – multiple inheritance
EXTENSION ACTIVITY 20A Not all programming languages support multiple inheritance. Check if the language you are using does.
Here is an example that shows the use of inheritance. A base class employee and the derived classes partTime and fullTime are defined. The objects permanentStaff and temporaryStaff are instanced in these examples and use the method showDetails.
Python

class employee:                               # base class employee
    def __init__(self, name, staffno):
        self.__name = name
        self.__staffno = staffno
        self.__fullTimeStaff = True

    def showDetails(self):
        print("Employee Name " + self.__name)
        print("Employee Number ", self.__staffno)

class partTime(employee):                     # derived class partTime
    def __init__(self, name, staffno):
        employee.__init__(self, name, staffno)
        self.__fullTimeStaff = False
        self.__hoursWorked = 0

    def getHoursWorked(self):
        return self.__hoursWorked

class fullTime(employee):                     # derived class fullTime
    def __init__(self, name, staffno):
        employee.__init__(self, name, staffno)
        self.__fullTimeStaff = True
        self.__yearlySalary = 0

    def getYearlySalary(self):
        return self.__yearlySalary

permanentStaff = fullTime("Eric Jones", 72)
permanentStaff.showDetails()
temporaryStaff = partTime("Alice Hu", 1017)
temporaryStaff.showDetails()
VB
'VB Employee OOP program with inheritance
Module Module1
    Public Sub Main()
        Dim permanentStaff As New fullTime("Eric Jones", 72, 50000.00)
        permanentStaff.showDetails()
        Dim temporaryStaff As New partTime("Alice Hu", 1017, 45)
        temporaryStaff.showDetails()
    End Sub

    Class employee                            ' base class employee
        Protected name As String
        Protected staffno As Integer
        Private fullTimeStaff As Boolean

        Public Sub New(ByVal n As String, ByVal s As Integer)
            name = n
            staffno = s
        End Sub

        Public Sub showDetails()
            Console.WriteLine("Employee Name " & name)
            Console.WriteLine("Employee Number " & staffno)
            Console.ReadKey()
        End Sub
    End Class

    Class partTime                            ' derived class partTime
        Inherits employee
        Private ReadOnly fullTimeStaff = False
        Private hoursWorked As Integer

        Public Sub New(ByVal n As String, ByVal s As Integer, ByVal h As Integer)
            MyBase.New(n, s)
            hoursWorked = h
        End Sub

        Public Function getHoursWorked() As Integer
            Return hoursWorked
        End Function
    End Class

    Class fullTime                            ' derived class fullTime
        Inherits employee
        Private ReadOnly fullTimeStaff = True
        Private yearlySalary As Decimal

        Public Sub New(ByVal n As String, ByVal s As Integer, ByVal y As Decimal)
            MyBase.New(n, s)
            yearlySalary = y
        End Sub

        Public Function getYearlySalary() As Decimal
            Return yearlySalary
        End Function
    End Class
End Module
Java
// Java employee OOP program with inheritance
class employee {                              // base class employee
    private String name;
    private int staffno;
    private boolean fullTimeStaff;

    employee(String n, int s) {
        name = n;
        staffno = s;
    }

    public void showDetails() {
        System.out.println("Employee Name " + name);
        System.out.println("Employee Number " + staffno);
    }
}

class partTime extends employee {             // derived class partTime
    private boolean fullTimeStaff = false;
    private int hoursWorked;

    partTime(String n, int s, int h) {
        super(n, s);
        hoursWorked = h;
    }

    public int getHoursWorked() {
        return hoursWorked;
    }
}

class fullTime extends employee {             // derived class fullTime
    private boolean fullTimeStaff = true;
    private double yearlySalary;

    fullTime(String n, int s, double y) {
        super(n, s);
        yearlySalary = y;
    }

    public double getYearlySalary() {
        return yearlySalary;
    }
}

public class MainInherit {
    public static void main(String[] args) {
        fullTime permanentStaff = new fullTime("Eric Jones", 72, 50000.00);
        permanentStaff.showDetails();
        partTime temporaryStaff = new partTime("Alice Hu", 1017, 45);
        temporaryStaff.showDetails();
    }
}

Figure 20.4 shows the inheritance diagram for the base class employee and the derived classes partTime and fullTime.

employee (base class)
    all employees have these attributes:
        name : STRING
        staffNo : INTEGER
        fullTimeStaff : BOOLEAN
    all employees have these methods:
        showDetails ()

partTime (derived class)
    only part time employees have these attributes and methods:
        hoursWorked : INTEGER
        getHoursWorked ()

fullTime (derived class)
    only full time employees have these attributes and methods:
        yearlySalary : REAL
        GetYearlySalary ()

▲ Figure 20.4 Inheritance diagram for employee, partTime and fullTime
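The relationship in Figure 20.4 can be explored with a small Python sketch. This is a simplified variant of the programs above, written under some assumptions: the method details (which returns a string instead of printing) is a hypothetical helper, and the partTime constructor takes the hours worked as a third parameter, as the VB and Java versions do.

```python
class employee:                              # base class
    def __init__(self, name, staffno):
        self.__name = name
        self.__staffno = staffno

    def details(self):                       # hypothetical helper for this sketch
        return self.__name + " (" + str(self.__staffno) + ")"

class partTime(employee):                    # derived class
    def __init__(self, name, staffno, hoursWorked):
        super().__init__(name, staffno)      # run the base class constructor
        self.__hoursWorked = hoursWorked

    def getHoursWorked(self):
        return self.__hoursWorked

temporaryStaff = partTime("Alice Hu", 1017, 45)
print(temporaryStaff.details())              # method inherited from employee
print(temporaryStaff.getHoursWorked())       # method defined only in partTime
print(isinstance(temporaryStaff, employee))  # a partTime object is also an employee
```

The derived class does not repeat the code for details; it is available because partTime inherits from employee.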
ACTIVITY 20D Write a short program to declare a class, student, with the private attributes name, dateOfBirth and examMark, and the public method displayExamMark. Declare the derived classes fullTimeStudent and partTimeStudent. Declare objects for each derived class, with a name and exam mark of your choice, and use your method to display the exam marks for these students.
Polymorphism and overloading
Polymorphism is when methods are redefined for derived classes. Overloading is when a method is defined more than once in a class so it can be used in different situations.

Example of polymorphism
A base class shape is defined, and the derived classes rectangle and circle are defined. The method area is redefined for both the rectangle class and the circle class. The objects myRectangle and myCircle are instanced in these programs.
Python
class shape:
    def __init__(self):
        self.__areaValue = 0
        self.__perimeterValue = 0

    def area(self):                           # original method in shape class
        print("Area ", self.__areaValue)

    def perimeter(self):
        print("Perimeter ", self.__perimeterValue)

class rectangle(shape):
    def __init__(self, length, breadth):
        shape.__init__(self)
        self.__length = length
        self.__breadth = breadth

    def area(self):                           # redefined method in rectangle class
        self.__areaValue = self.__length * self.__breadth
        print("Area ", self.__areaValue)

class circle(shape):
    def __init__(self, radius):
        shape.__init__(self)
        self.__radius = radius

    def area(self):                           # redefined method in circle class
        self.__areaValue = self.__radius * self.__radius * 3.142
        print("Area ", self.__areaValue)

myCircle = circle(20)
myCircle.area()
myRectangle = rectangle(10, 17)
myRectangle.area()

VB
'VB shape OOP program with polymorphism
Module Module1
    Public Sub Main()
        Dim myCircle As New circle(20)
        myCircle.area()
        Dim myRectangle As New rectangle(10, 17)
        myRectangle.area()
        Console.ReadKey()
    End Sub

    Class shape
        Protected areaValue As Decimal
        Protected perimeterValue As Decimal

        Overridable Sub area()                ' original method in shape class
            Console.WriteLine("Area " & areaValue)
        End Sub

        Overridable Sub perimeter()
            Console.WriteLine("Perimeter " & perimeterValue)
        End Sub
    End Class

    Class rectangle
        Inherits shape
        Private length As Decimal
        Private breadth As Decimal

        Public Sub New(ByVal l As Decimal, ByVal b As Decimal)
            length = l
            breadth = b
        End Sub

        Overrides Sub area()                  ' redefined method in rectangle class
            areaValue = length * breadth
            Console.WriteLine("Area " & areaValue)
        End Sub
    End Class

    Class circle
        Inherits shape
        Private radius As Decimal

        Public Sub New(ByVal r As Decimal)
            radius = r
        End Sub

        Overrides Sub area()                  ' redefined method in circle class
            areaValue = radius * radius * 3.142D    ' D marks a Decimal literal
            Console.WriteLine("Area " & areaValue)
        End Sub
    End Class
End Module
Java
// Java shape OOP program with polymorphism
class shape {
    protected double areaValue;
    protected double perimeterValue;

    public void area() {                      // original method in shape class
        System.out.println("Area " + areaValue);
    }
}

class rectangle extends shape {
    private double length;
    private double breadth;

    rectangle(double l, double b) {
        length = l;
        breadth = b;
    }

    public void area() {                      // redefined method in rectangle class
        areaValue = length * breadth;
        System.out.println("Area " + areaValue);
    }
}

class circle extends shape {
    private double radius;

    circle(double r) {
        radius = r;
    }

    public void area() {                      // redefined method in circle class
        areaValue = radius * radius * 3.142;
        System.out.println("Area " + areaValue);
    }
}

public class MainShape {
    public static void main(String[] args) {
        circle myCircle = new circle(20);
        myCircle.area();
        rectangle myRectangle = new rectangle(10, 17);
        myRectangle.area();
    }
}
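One practical benefit of polymorphism is that the same call works on any object derived from the base class. The Python sketch below illustrates this under a simplifying assumption: unlike the programs above, each area method here returns the value instead of printing it, and the list names shapes and areas are hypothetical.

```python
class shape:
    def area(self):                      # original method in shape class
        return 0

class rectangle(shape):
    def __init__(self, length, breadth):
        self.length = length
        self.breadth = breadth

    def area(self):                      # redefined method in rectangle class
        return self.length * self.breadth

class circle(shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):                      # redefined method in circle class
        return self.radius * self.radius * 3.142

# The same call, s.area(), runs a different method for each type of object.
shapes = [circle(20), rectangle(10, 17), shape()]
areas = [s.area() for s in shapes]
print(areas)
```

The loop does not need to know which kind of shape it is dealing with; the correct version of area is selected from the class of each object.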
ACTIVITY 20E
Write a short program to declare the class shape with the public method area.
Declare the derived classes circle, rectangle and square. Use polymorphism to redefine the method area for these derived classes. Declare objects for each derived class and instance them with suitable data.
Use your methods to display the areas for these shapes.

Example of overloading
One way of overloading a method is to use the method with a different number of parameters. For example, a class greeting is defined with the method hello. The object myGreeting is instanced and uses this method with no parameters or one parameter in these programs, which show how Python, VB and Java manage overloading. Python does not allow two methods with the same name in one class, so the Python program uses a default parameter value instead.

Python
class greeting:
    def hello(self, name = None):
        if name is not None:
            print("Hello " + name)
        else:
            print("Hello")

myGreeting = greeting()
myGreeting.hello()                            # method used with no parameters
myGreeting.hello("Christopher")               # method used with one parameter
VB
Module Module1
    Public Sub Main()
        Dim myGreeting As New greeting
        myGreeting.hello()                    ' method used with no parameters
        myGreeting.hello("Christopher")       ' method used with one parameter
        Console.ReadKey()
    End Sub

    Class greeting
        Public Overloads Sub hello()
            Console.WriteLine("Hello")
        End Sub

        Public Overloads Sub hello(ByVal name As String)
            Console.WriteLine("Hello " & name)
        End Sub
    End Class
End Module

Java
class greeting {
    public void hello() {
        System.out.println("Hello");
    }

    public void hello(String name) {
        System.out.println("Hello " + name);
    }
}

class mainOverload {
    public static void main(String args[]) {
        greeting myGreeting = new greeting();
        myGreeting.hello();                   // method used with no parameters
        myGreeting.hello("Christopher");      // method used with one parameter
    }
}
ACTIVITY 20F Write a short program to declare the class greeting, with the public method hello, which can be used without a name, with one name or with a first name and last name. Declare an object and use the method to display each type of greeting.
Containment
Containment, or aggregation, is the process by which one class can contain other classes. This can be presented in a class diagram.
When the class ‘aeroplane’ is defined, and the definition contains references to the classes – seat, fuselage, wing, cockpit – this is an example of containment.

aeroplane contains:
    seat
    fuselage
    wing
    cockpit

▲ Figure 20.5
When deciding whether to use inheritance or containment, it is useful to think about how the classes used would be related in the real world.
For example
» when looking at shapes, a circle is a shape – so inheritance would be used
» when looking at the aeroplane, an aeroplane contains wings – so containment would be used.
Consider the people on board an aeroplane for a flight. The containment diagram could look like this if there can be up to 10 crew and 350 passengers on board:

flight
    flightID : STRING
    numberOfCrew : INTEGER
    flightCrew [1 : 10] OF crew
    numberOfPassengers : INTEGER
    flightPassengers [1 : 350] OF passenger
    :
    addCrew ()
    addPassenger ()
    removeCrew ()
    removePassenger ()
    :

crew (contained in flight)
    firstName : STRING
    lastName : STRING
    role : STRING
    :
    showCrewDetails ()
    :

passenger (contained in flight)
    firstName : STRING
    lastName : STRING
    seatNumber : STRING
    :
    showPassengerDetails ()
    :

▲ Figure 20.6
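The containment diagram in Figure 20.6 could be sketched in Python as shown below. This is one possible reading of the diagram, not a complete implementation: Python lists stand in for the fixed-size arrays, the constants MAX_CREW and MAX_PASSENGERS are hypothetical names for the limits of 10 and 350, and the flight ID and people are invented sample data.

```python
class crew:
    def __init__(self, firstName, lastName, role):
        self.firstName = firstName
        self.lastName = lastName
        self.role = role

class passenger:
    def __init__(self, firstName, lastName, seatNumber):
        self.firstName = firstName
        self.lastName = lastName
        self.seatNumber = seatNumber

class flight:
    MAX_CREW = 10                  # hypothetical constant names for the limits
    MAX_PASSENGERS = 350

    def __init__(self, flightID):
        self.flightID = flightID
        self.flightCrew = []       # the flight contains crew objects
        self.flightPassengers = [] # the flight contains passenger objects

    def addCrew(self, member):
        if len(self.flightCrew) < flight.MAX_CREW:
            self.flightCrew.append(member)
            return True
        return False               # crew list is full

    def addPassenger(self, person):
        if len(self.flightPassengers) < flight.MAX_PASSENGERS:
            self.flightPassengers.append(person)
            return True
        return False               # passenger list is full

myFlight = flight("XY123")
myFlight.addCrew(crew("Sam", "Lee", "pilot"))
myFlight.addPassenger(passenger("Ana", "Diaz", "14C"))
print(len(myFlight.flightCrew), len(myFlight.flightPassengers))  # 1 1
```

A flight is not a kind of crew member or passenger; it holds them, which is why containment is used here and not inheritance.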
ACTIVITY 20G Draw a containment diagram for a course at university where there are up to 50 lectures, three examinations and the final mark is the average of the marks for the three examinations.
Object methods
In OOP, the basic methods used during the life of an object can be divided into these types: constructors, setters, getters, and destructors.
A constructor is the method used to initialise a new object. Each object is initialised when a new instance is declared. When an object is initialised, memory is allocated.
For example, in the first program in Chapter 20, this is the method used to construct a new employee object.

Constructor

Python
def __init__(self, name, staffno):
    self.__name = name
    self.__staffno = staffno

VB
Public Sub New(ByVal n As String, ByVal s As Integer)
    name = n
    staffno = s
End Sub

▲ Table 20.1

Constructing an object

Python
myStaff = employee("Eric Jones", 72)

VB
Dim myStaff As New employee("Eric Jones", 72)

Java
employee myStaff = new employee("Eric Jones", 72);

▲ Table 20.2
A setter is a method used to control changes to any variable that is declared within an object. When a variable is declared as private, only the setters declared within the object’s class can be used to make changes to the variable within the object, thus making the program more robust.
For example, in the employee base class, this code is a setter:

Setter

Python
def setName(self, n):
    self.__name = n

VB
Public Sub setName(ByVal n As String)
    name = n
End Sub

Java
public void setName(String n) {
    this.name = n;
}

▲ Table 20.3
A getter is a method that gets the value of a property of an object.
For example, in the partTime derived class, this method is a getter:

Getter

Python
def getHoursWorked(self):
    return self.__hoursWorked

VB
Public Function getHoursWorked() As Integer
    Return hoursWorked
End Function

Java
public int getHoursWorked() {
    return hoursWorked;
}

▲ Table 20.4
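A setter makes a program more robust when it checks a new value before storing it. The Python sketch below shows the idea; the method setHoursWorked and the rule that hours must lie between 0 and 60 are assumptions made for this illustration and do not come from the programs above.

```python
class partTime:
    def __init__(self, name, hoursWorked):
        self.__name = name
        self.__hoursWorked = hoursWorked

    def getHoursWorked(self):            # getter
        return self.__hoursWorked

    def setHoursWorked(self, h):         # setter (hypothetical, with validation)
        # Assumed rule for this sketch: hours must be between 0 and 60.
        if 0 <= h <= 60:
            self.__hoursWorked = h
            return True
        return False                     # invalid value is rejected

temporaryStaff = partTime("Alice Hu", 45)
print(temporaryStaff.setHoursWorked(30))   # True
print(temporaryStaff.setHoursWorked(-5))   # False
print(temporaryStaff.getHoursWorked())     # 30
```

Because the attribute is private, the only way to change it is through the setter, so an invalid value cannot be stored.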
A destructor is a method that is invoked to destroy an object. When an object is destroyed the memory is released so that it can be reused. Both Java and VB use garbage collection to automatically destroy objects that are no longer used so that the memory used can be released. In VB garbage collection can be invoked as a method if required but it is not usually needed.
For example, in any of the Python programs above, this could be used as a destructor:

def __del__(self):

Here is an example of a destructor being used in a Python program:

class shape:
    def __init__(self):
        self.__areaValue = 0
        self.__perimeterValue = 0

    def __del__(self):                        # destructor
        print("Shape deleted")

    def area(self):
        print("Area ", self.__areaValue)

    def perimeter(self):
        print("Perimeter ", self.__perimeterValue)
:
:
del myCircle                                  # object destroyed
Here are examples of destructors in Python and VB.

Destructor

Python
def __del__(self):
    print("Object deleted")

VB – only if required, automatically called at end of program
Protected Overrides Sub Finalize()
    Console.WriteLine("Object deleted")
    Console.ReadKey()
End Sub

Java – not used

▲ Table 20.5
Writing a program for a binary tree
In Chapter 19, we looked at the data structure and some of the operations for a binary tree using fixed length arrays in pseudocode. You will need to be able to write a program to implement a binary tree, search for an item in the binary tree and store an item in a binary tree. Binary trees are best implemented using objects, constructors, containment, functions, procedures and recursion.
» Objects – tree and node
» Constructor – adding a new node to the tree
» Containment – the tree contains nodes
» Function – search the binary tree for an item
» Procedure – insert a new item in the binary tree
The data structures and operations to implement a binary tree for integer values in ascending order are set out in Tables 20.6–9 below. If you are unsure how the binary tree works, review Chapter 19.

Binary tree data structure – Class node

Python – the values for new nodes are set here. Python uses None for null pointers
class Node:
    def __init__(self, item):
        self.left = None
        self.right = None
        self.item = item

VB with a recursive definition of node to allow for a tree of any size
Public Class Node
    Public item As Integer
    Public left As Node
    Public right As Node
    Public Function GetNodeItem()
        Return item
    End Function
End Class

Java with a recursive definition of node to allow for a tree of any size
class Node {
    int item;
    Node left;
    Node right;
    Node(int item) {              // constructor sets the value of a new node
        this.item = item;
    }
}

▲ Table 20.6
Binary tree data structure – Class tree

Python – the root of the tree is set as an instance of Node
tree = Node(27)

VB uses Nothing for null pointers
Public Class BinaryTree
    Public root As Node
    Public Sub New()
        root = Nothing
    End Sub
End Class

Java uses null for null pointers
class BinaryTree {
    Node root;
    BinaryTree() {
        root = null;              // an empty tree has no root node yet
    }
}

▲ Table 20.7
Add integer to binary tree

Python showing a recursive procedure to insert a new node and the pointers to it
def insert(self, item):
    if self.item is not None:
        if item < self.item:
            if self.left is None:
                self.left = Node(item)
            else:
                self.left.insert(item)
        elif item > self.item:
            if self.right is None:
                self.right = Node(item)
            else:
                self.right.insert(item)
    else:
        self.item = item

VB showing a recursive procedure to insert a new node
Public Sub insert(ByVal item As Integer)
    Dim newNode As New Node()
    newNode.item = item
    If root Is Nothing Then
        root = newNode
    Else
        insertNode(root, newNode)
    End If
End Sub

' insertNode is a helper added here so that the recursion works on nodes
Private Sub insertNode(ByVal current As Node, ByVal newNode As Node)
    If newNode.item < current.item Then
        If current.left Is Nothing Then
            current.left = newNode
        Else
            insertNode(current.left, newNode)
        End If
    ElseIf newNode.item > current.item Then
        If current.right Is Nothing Then
            current.right = newNode
        Else
            insertNode(current.right, newNode)
        End If
    End If
End Sub
Java showing a recursive procedure to insert a new node
void insert(Node node, int item) {
    if (item < node.item) {
        if (node.left != null)
            insert(node.left, item);
        else
            node.left = new Node(item);
    }
    else if (item > node.item) {
        if (node.right != null)
            insert(node.right, item);
        else
            node.right = new Node(item);
    }
}

▲ Table 20.8
Search for integer in binary tree

Python – the function returns the value searched for if it is found, otherwise it returns None
def search(self, item):
    current = self                # start at this node without changing the tree
    while current is not None and current.item != item:
        if item < current.item:
            current = current.left
        else:
            current = current.right
    if current is None:
        return None
    return current.item

VB – the function returns the value searched for if it is found, otherwise it returns Nothing (which is 0 for an Integer)
Public Function search(ByVal item As Integer) As Integer
    Dim current As Node = root
    While current.item <> item
        If item < current.item Then
            current = current.left
        Else
            current = current.right
        End If
        If current Is Nothing Then
            Return Nothing
        End If
    End While
    Return current.item
End Function
Java – the function returns the node holding the value searched for if it is found, otherwise it returns null
Node search(int item, Node node) {
    while (item != node.item) {
        if (item < node.item)
            node = node.left;
        else
            node = node.right;
        if (node == null)
            return null;
    }
    return node;
}

▲ Table 20.9
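The separate pieces in Tables 20.6 to 20.9 can be combined into one short Python program, sketched below. It is one possible arrangement and not the only way to build the tree: the method in_order is an extra helper added for this sketch to show that the values are held in ascending order, equal values are simply ignored, and the sample values are arbitrary.

```python
class Node:
    def __init__(self, item):
        self.left = None
        self.right = None
        self.item = item

    def insert(self, item):
        # Recursive insert: smaller items go left, larger items go right.
        if item < self.item:
            if self.left is None:
                self.left = Node(item)
            else:
                self.left.insert(item)
        elif item > self.item:
            if self.right is None:
                self.right = Node(item)
            else:
                self.right.insert(item)
        # an item equal to self.item is already in the tree, so nothing is done

    def search(self, item):
        # Walk down the tree; return the item if found, otherwise None.
        current = self
        while current is not None:
            if item == current.item:
                return current.item
            current = current.left if item < current.item else current.right
        return None

    def in_order(self):
        # Helper for this sketch: list the items in ascending order.
        left = self.left.in_order() if self.left else []
        right = self.right.in_order() if self.right else []
        return left + [self.item] + right

tree = Node(50)
for value in [30, 70, 20, 40]:
    tree.insert(value)

print(tree.in_order())   # [20, 30, 40, 50, 70]
print(tree.search(40))   # 40
print(tree.search(99))   # None
```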
ACTIVITY 20H In your chosen programming language, write a program using objects and recursion to implement a binary tree. Test your program by setting the root of the tree to 27, then adding the integers 19, 36, 42 and 16 in that order.
EXTENSION ACTIVITY 20B Complete a pre-order and post-order traverse of your binary tree and print the results.
20.1.4 Declarative programming
Declarative programming is used to extract knowledge by the use of queries from a situation with known facts and rules. In Chapter 8, Section 8.3 we looked at the use of SQL scripts to query relational databases. It can be argued that SQL uses declarative programming. Review Section 8.3 to remind yourself how SQL performs queries.
Here is an example of an SQL query from Chapter 8:

SELECT FirstName, SecondName
FROM Student
WHERE ClassID = '7A'
ORDER BY SecondName

Declarative programming uses statements of facts and rules together with a mechanism for setting goals in the form of a query. A fact is a ‘thing’ that is known, and rules are relationships between facts. Writing declarative programs is very different to writing imperative programs. In imperative programming, the programmer writes a list of statements in the order that they will be performed. But in declarative programming, the programmer can state the facts and rules in any order before writing the query.
Prolog is a declarative programming language that uses predicate logic to write facts and rules. For example, the fact that France is a country would be written in predicate logic as:
country(france).
Note that the names used in Prolog facts begin with a lower-case letter and that every fact ends with a full stop. Another fact about France – the language spoken in France is French – could be written as:
language(france,french).
A set of facts could look like this:
country(france).
country(germany).
country(japan).
country(newZealand).
country(england).
country(switzerland).
language(france,french).
language(germany,german).
language(japan,japanese).
language(newZealand,english).
language(england,english).
language(switzerland,french).
language(switzerland,german).
language(switzerland,italian).
These facts are used to set up a knowledge base. This knowledge base can be consulted using queries. For example, a query about countries that speak a certain language, English, could look like this:
language(Country,english).
Note that a variable in Prolog – Country, in this example – begins with an upper-case letter. This would give the following results:
newZealand ;
england.
The results are usually shown in the order the facts are stored in the knowledge base. A query about the languages spoken in a country, Switzerland, could look like this:
language(switzerland,Language).
And these are the results:
french, german, italian.
When a query is written as a statement, this statement is called a goal and the result is found when the goal is satisfied using the facts and rules available.
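To see what satisfying a goal involves, the lookup can be imitated in an imperative language. The Python sketch below is only an illustration under stated assumptions: the names FACTS, is_variable and query are invented for this example, and a real Prolog system uses a far more general matching process.

```python
# Illustrative sketch only: FACTS, is_variable and query are hypothetical names,
# and this is not how a Prolog system is implemented internally.
FACTS = [
    ("language", "france", "french"),
    ("language", "germany", "german"),
    ("language", "japan", "japanese"),
    ("language", "newZealand", "english"),
    ("language", "england", "english"),
    ("language", "switzerland", "french"),
    ("language", "switzerland", "german"),
    ("language", "switzerland", "italian"),
]

def is_variable(term):
    """As in Prolog, a term that starts with an upper-case letter is a variable."""
    return term[0].isupper()

def query(*goal):
    """Return one dictionary of variable bindings for each fact that satisfies the goal."""
    results = []
    for fact in FACTS:                      # facts are tried in the order they are stored
        if len(fact) != len(goal):
            continue
        bindings = {}
        matched = True
        for wanted, stored in zip(goal, fact):
            if is_variable(wanted):
                # a variable matches any value, but must keep one value if repeated
                if bindings.setdefault(wanted, stored) != stored:
                    matched = False
                    break
            elif wanted != stored:
                matched = False
                break
        if matched:
            results.append(bindings)
    return results

print(query("language", "Country", "english"))
print(query("language", "switzerland", "Language"))
```

Each dictionary in the returned list corresponds to one result that Prolog would display, in the order the facts are stored.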
ACTIVITY 20I
Use the facts above to write queries to find out which language is spoken in England and which country speaks Japanese. Take care with the use of capital letters.

EXTENSION ACTIVITY 20C
Download SWI-Prolog and write a short program to provide facts about other countries and languages and save the file. Then consult the file to find out which languages are spoken in some of the countries. Note that SWI-Prolog is available as a free download.

The results for the country Switzerland query would look like this in SWI-Prolog, where ?- is the prompt and pressing ; gets the next result:
?- language(switzerland,Language).
Language = french ;
Language = german ;
Language = italian.
Most knowledge bases also make use of rules, which are also written using predicate logic. Here is a knowledge base for the interest paid on bank accounts. The facts about each account include the name of the account holder, the type of account (current or savings), and the amount of money in the account. The facts about the interest rates are the percentage rate, the type of account and the amount of money needed in the account for that interest rate to be paid.
The facts are:
bankAccount(laila,current,500.00).
bankAccount(stefan,savings,50).
bankAccount(paul,current,45.00).
bankAccount(tasha,savings,5000.00).
interest(twoPercent,current,500.00).
interest(onePercent,current,0).
interest(tenPercent,savings,5000.00).
interest(fivePercent,savings,0).
The rule for the rate of interest to be used is:
savingsRate(Name,Rate) :-
    bankAccount(Name,Type,Amount),
    interest(Rate,Type,Base),
    Amount >= Base.
Here is an example of a query using the above rule:
savingsRate(stefan,X).
And here is the result:
fivePercent
Here are examples of queries to find bank account details:
bankAccount(laila,X,Y).
bankAccount(victor,X,Y).
And here are the results:
current, 500.0
false
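The savingsRate rule combines two facts that share the same Type and then applies the test Amount >= Base. A rough imperative equivalent is sketched below in Python; the list names and the function savings_rate are assumptions made for this illustration, and the data is copied from the knowledge base above.

```python
# Facts copied from the knowledge base above; the Python names are hypothetical.
bank_accounts = [
    ("laila", "current", 500.00),
    ("stefan", "savings", 50),
    ("paul", "current", 45.00),
    ("tasha", "savings", 5000.00),
]
interest = [
    ("twoPercent", "current", 500.00),
    ("onePercent", "current", 0),
    ("tenPercent", "savings", 5000.00),
    ("fivePercent", "savings", 0),
]

def savings_rate(name):
    """Yield every Rate that satisfies savingsRate(name, Rate)."""
    for holder, account_type, amount in bank_accounts:
        if holder != name:
            continue
        for rate, rate_type, base in interest:
            # both sub-goals must agree on Type, and Amount >= Base must hold
            if rate_type == account_type and amount >= base:
                yield rate

print(list(savings_rate("stefan")))   # ['fivePercent']
print(list(savings_rate("tasha")))    # ['tenPercent', 'fivePercent']
print(list(savings_rate("victor")))   # []
```

Note how much of the Python version is about the order of the search, which the declarative version leaves to the Prolog system.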
ACTIVITY 20J Carry out the following activities using the information above. 1 Write a query to find out the interest rate for Laila’s bank account. 2 Write a query to find who has savings accounts. 3 a) Set up a savings account for Robert with 300.00. b) Set up a new fact about savings accounts allowing for an interest rate of 7% if there is 2000.00 or more in a savings account.
EXTENSION ACTIVITY 20D Use SWI-Prolog to check your answers to the previous activity.
ACTIVITY 20K
1 Explain the difference between the four modes of addressing in a low-level programming language. Illustrate your answer with assembly language code for each mode of addressing.
2 Compare and contrast the use of imperative (procedural) programming with OOP. Use the shape programs you developed in Activities 20B and 20E to illustrate your answer with examples to show the difference in the paradigms.
3 Use the knowledge base below to answer the following questions:
language(fortran,highLevel).
language(cobol,highLevel).
language(visualBasic,highLevel).
language(visualBasic,oop).
language(python,highLevel).
language(python,oop).
language(assembly,lowLevel).
language(masm,lowLevel).
translator(assembler,lowLevel).
translator(compiler,highLevel).
teaching(X) :-
    language(X,oop),
    language(X,highLevel).
a) Write two new facts about Java, showing that it is a high-level language and uses OOP.
b) Show the results from these queries:
i) teaching(X).
ii) teaching(masm).
c) Write a query to show all programming languages translated by an assembler.

20.2 File processing and exception handling
WHAT YOU SHOULD ALREADY KNOW
In Chapter 10, Section 10.3, you learnt about text files, and in Chapter 13, Section 13.2, you learnt about file organisation and access. Review these sections, then try these three questions before you read the second part of this chapter.
1 a) Write a program to set up a text file to store records like this, with one record on every line.
TYPE TstudentRecord
    DECLARE name : STRING
    DECLARE address : STRING
    DECLARE className : STRING
ENDTYPE
b) Write a procedure to append a record.
c) Write a procedure to find and delete a record.
d) Write a procedure to output all the records.
2 Describe three types of file organisation.
3 Describe two types of file access and explain which type of files each one is used for.
Key terms
Read – file access mode in which data can be read from a file.
Write – file access mode in which data can be written to a file; any existing data stored in the file will be overwritten.
Append – file access mode in which data can be added to the end of a file.
Open – file-processing operation; opens a file ready to be used in a program.
Close – file-processing operation; closes a file so it can no longer be used by a program.
Exception – an unexpected event that disrupts the execution of a program.
Exception handling – the process of responding to an exception within the program so that the program does not halt unexpectedly.
20.2.1 File processing operations
Files are frequently used to store records that include data types other than string. Also, many programs need to handle random access files so that a record can be found quickly without reading through all the preceding records. A typical record to be stored in a file could be declared like this in pseudocode:
TYPE TstudentRecord
    DECLARE name : STRING
    DECLARE registerNumber : INTEGER
    DECLARE dateOfBirth : DATE
    DECLARE fullTime : BOOLEAN
ENDTYPE

Storing records in a serial or sequential file
The algorithm to store records sequentially in a serial (unordered) or sequential (ordered on a key field) file is very similar to the algorithm for storing lines of text in a text file. The algorithm written in pseudocode below stores the student records sequentially in a serial file as they are input. Note that PUTRECORD is the pseudocode to write a record to a data file and GETRECORD is the pseudocode to read a record from a data file.
DECLARE studentRecord : ARRAY[1:50] OF TstudentRecord
DECLARE studentFile : STRING
DECLARE counter : INTEGER
counter ← 1
studentFile ← "studentFile.dat"
OPEN studentFile FOR WRITE
REPEAT
    OUTPUT "Please enter student details"
    OUTPUT "Please enter student name"
    INPUT studentRecord[counter].name
    IF studentRecord[counter].name <> "" THEN
        OUTPUT "Please enter student’s register number"
        INPUT studentRecord[counter].registerNumber
        OUTPUT "Please enter student’s date of birth"
        INPUT studentRecord[counter].dateOfBirth
        OUTPUT "Please enter True for full-time or False for part-time"
        INPUT studentRecord[counter].fullTime
        PUTRECORD studentFile, studentRecord[counter]
    ELSE
        CLOSEFILE(studentFile)
    ENDIF
    counter ← counter + 1
UNTIL studentRecord[counter - 1].name = ""    // the name just entered was blank
OUTPUT "The file contains these records: "
OPEN studentFile FOR READ
counter ← 1
REPEAT
    GETRECORD studentFile, studentRecord[counter]
    OUTPUT studentRecord[counter]
    counter ← counter + 1
UNTIL EOF(studentFile)
CLOSEFILE(studentFile)

Identifier name     Description
studentRecord       Array of records to be written to the file
studentFile         File name
counter             Counter for records
▲ Table 20.10
If a sequential file was required, then the student records would need to be input into an array of records first, then sorted on the key field registerNumber, before the array of records was written to the file.
Here are programs in Python, VB and Java to write a single record to a file.

Python
import pickle      # library to use binary files
import datetime    # needed for the date of birth

class student:
    def __init__(self):
        self.name = ""
        self.registerNumber = 0
        self.dateOfBirth = datetime.datetime.now()
        self.fullTime = True

studentRecord = student()
studentFile = open('students.DAT', 'w+b')    # create a binary file to store the data
print("Please enter student details")
studentRecord.name = input("Please enter student name ")
studentRecord.registerNumber = int(input("Please enter student's register number "))
year = int(input("Please enter student's year of birth YYYY "))
month = int(input("Please enter student's month of birth MM "))
day = int(input("Please enter student's day of birth DD "))
studentRecord.dateOfBirth = datetime.datetime(year, month, day)
# bool() of any non-empty string is True, so the text entered is compared instead
studentRecord.fullTime = input("Please enter True for full-time or False for part-time ").strip().lower() == "true"
pickle.dump(studentRecord, studentFile)      # write record to file
print(studentRecord.name, studentRecord.registerNumber, studentRecord.dateOfBirth, studentRecord.fullTime)
studentFile.close()

studentFile = open('students.DAT', 'rb')     # open binary file to read from
studentRecord = pickle.load(studentFile)     # read record from file
print(studentRecord.name, studentRecord.registerNumber, studentRecord.dateOfBirth, studentRecord.fullTime)
studentFile.close()
VB
Option Explicit On
Imports System.IO    ' library to use Input and Output

Module Module1
    Public Sub Main()
        Dim studentFileWriter As BinaryWriter
        Dim studentFileReader As BinaryReader
        Dim studentFile As FileStream
        Dim year, month, day As Integer
        Dim studentRecord As New student()

        ' create a file to store the data
        studentFile = New FileStream("studentFile.DAT", FileMode.Create)
        studentFileWriter = New BinaryWriter(studentFile)
        Console.Write("Please enter student name ")
        studentRecord.name = Console.ReadLine()
        Console.Write("Please enter student's register number ")
        studentRecord.registerNumber = Integer.Parse(Console.ReadLine())
        Console.Write("Please enter student's year of birth YYYY ")
        year = Integer.Parse(Console.ReadLine())
        Console.Write("Please enter student's month of birth MM ")
        month = Integer.Parse(Console.ReadLine())
        Console.Write("Please enter student's day of birth DD ")
        day = Integer.Parse(Console.ReadLine())
        studentRecord.dateOfBirth = DateSerial(year, month, day)
        Console.Write("Please enter True for full-time or False for part-time ")
        studentRecord.fullTime = Boolean.Parse(Console.ReadLine())

        ' write record to file
        studentFileWriter.Write(studentRecord.name)
        studentFileWriter.Write(studentRecord.registerNumber)
        studentFileWriter.Write(studentRecord.dateOfBirth)
        studentFileWriter.Write(studentRecord.fullTime)
        studentFileWriter.Close()
        studentFile.Close()

        ' open file to read from
        studentFile = New FileStream("studentFile.DAT", FileMode.Open)
        studentFileReader = New BinaryReader(studentFile)

        ' read record from file
        studentRecord.name = studentFileReader.ReadString()
        studentRecord.registerNumber = studentFileReader.ReadInt32()
        studentRecord.dateOfBirth = studentFileReader.ReadString()
        studentRecord.fullTime = studentFileReader.ReadBoolean()
        studentFileReader.Close()
        studentFile.Close()
        Console.WriteLine(studentRecord.name & " " & studentRecord.registerNumber & " " & studentRecord.dateOfBirth & " " & studentRecord.fullTime)
        Console.ReadKey()
    End Sub

    Class student
        Public name As String
        Public registerNumber As Integer
        Public dateOfBirth As Date
        Public fullTime As Boolean
    End Class
End Module
Java
(Java programs using files need to include exception handling – see Section 20.2.2 later in this chapter.)
import java.io.File;
import java.io.FileWriter;
import java.util.Scanner;
import java.util.Date;
import java.text.SimpleDateFormat;

class Student {
    private String name;
    private int registerNumber;
    private Date dateOfBirth;
    private boolean fullTime;

    Student(String name, int registerNumber, Date dateOfBirth, boolean fullTime) {
        this.name = name;
        this.registerNumber = registerNumber;
        this.dateOfBirth = dateOfBirth;
        this.fullTime = fullTime;
    }

    public String toString() {
        return name + " " + registerNumber + " " + dateOfBirth + " " + fullTime;
    }
}

public class StudentRecordFile {
    public static void main(String[] args) throws Exception {
        Scanner input = new Scanner(System.in);
        System.out.println("Please enter Student details");
        System.out.println("Please enter Student name ");
        String nameIn = input.next();
        System.out.println("Please enter Student's register number ");
        int registerNumberIn = input.nextInt();
        System.out.println("Please enter Student's date of birth as YYYY-MM-DD ");
        String DOBIn = input.next();
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        Date dateOfBirthIn = format.parse(DOBIn);
        System.out.println("Please enter true for full-time or false for part-time ");
        boolean fullTimeIn = input.nextBoolean();
        Student studentRecord = new Student(nameIn, registerNumberIn, dateOfBirthIn, fullTimeIn);
        System.out.println(studentRecord.toString());

        // This is the file that we are going to write to and then read from
        File studentFile = new File("Student.txt");

        // Write the record to the student file
        // Note - this try-with-resources syntax only works with Java 7 and later
        try (FileWriter studentFileWriter = new FileWriter(studentFile)) {
            studentFileWriter.write(studentRecord.toString());
        }

        // Print all the lines of text in the student file
        try (Scanner studentReader = new Scanner(studentFile)) {
            while (studentReader.hasNextLine()) {
                String data = studentReader.nextLine();
                System.out.println(data);
            }
        }
    }
}

ACTIVITY 20L
In the programming language of your choice, write a program to
■ input a student record and save it to a new serial file
■ read a student record from that file
■ extend your program to work for more than one record.

EXTENSION ACTIVITY 20E
In the programming language of your choice, extend your program to sort the records on registerNumber before storing in the file.
Adding a record to a sequential file
Records can be appended to the end of a serial file by opening the file in append mode. If records need to be added to a sequential file, then the whole file needs to be recreated and the record stored in the correct place. The algorithm written in pseudocode below inserts a student record into the correct place in a sequential file.
DECLARE studentRecord : TstudentRecord
DECLARE newStudentRecord : TstudentRecord
DECLARE studentFile : STRING
DECLARE newStudentFile : STRING
DECLARE recordAddedFlag : BOOLEAN
recordAddedFlag ← FALSE
studentFile ← "studentFile.dat"
newStudentFile ← "newStudentFile.dat"
CREATE newStudentFile    // creates a new file to write to
OPEN newStudentFile FOR WRITE
OPEN studentFile FOR READ
OUTPUT "Please enter student details"
OUTPUT "Please enter student name"
INPUT newStudentRecord.name
OUTPUT "Please enter student’s register number"
INPUT newStudentRecord.registerNumber
OUTPUT "Please enter student’s date of birth"
INPUT newStudentRecord.dateOfBirth
OUTPUT "Please enter True for full-time or False for part-time"
INPUT newStudentRecord.fullTime
WHILE NOT recordAddedFlag AND NOT EOF(studentFile)
    GETRECORD studentFile, studentRecord    // gets record from existing file
    IF newStudentRecord.registerNumber > studentRecord.registerNumber THEN
        PUTRECORD newStudentFile, studentRecord    // writes record from existing file to new file
    ELSE
        PUTRECORD newStudentFile, newStudentRecord    // or writes new record to new file in the correct place
        PUTRECORD newStudentFile, studentRecord    // followed by the record just read from the existing file
        recordAddedFlag ← TRUE
    ENDIF
ENDWHILE
IF NOT recordAddedFlag THEN
    PUTRECORD newStudentFile, newStudentRecord    // add new record at end of the new file
ELSE
    WHILE NOT EOF(studentFile)
        GETRECORD studentFile, studentRecord
        PUTRECORD newStudentFile, studentRecord    // transfers all remaining records to the new file
    ENDWHILE
ENDIF
CLOSEFILE(studentFile)
CLOSEFILE(newStudentFile)
DELETE(studentFile)    // deletes old file of student records
RENAME newStudentFile, studentFile    // renames new file to be the student record file
Identifier name     Description
studentRecord       record from student file
newStudentRecord    new record to be written to the file
studentFile         student file name
newStudentFile      temporary file name
▲ Table 20.11
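The copy-and-insert approach can be sketched in Python as follows. This is a minimal illustration rather than a model answer: it assumes each record is a dictionary pickled one after another in the file, and the names insert_record, read_all and the ".tmp" suffix are invented for the example.

```python
import os
import pickle

def insert_record(file_name, new_record, key="registerNumber"):
    """Rebuild file_name in a temporary file with new_record in key order."""
    temp_name = file_name + ".tmp"        # hypothetical name for the new file
    added = False
    with open(file_name, "rb") as old_file, open(temp_name, "wb") as new_file:
        while True:
            try:
                record = pickle.load(old_file)    # get record from existing file
            except EOFError:                      # no more records to copy
                break
            if not added and new_record[key] <= record[key]:
                pickle.dump(new_record, new_file) # new record goes in first
                added = True
            pickle.dump(record, new_file)         # existing record is always copied
        if not added:
            pickle.dump(new_record, new_file)     # largest key: add at the end
    os.replace(temp_name, file_name)              # new file replaces the old one

def read_all(file_name):
    """Return every record stored in file_name as a list."""
    records = []
    with open(file_name, "rb") as data_file:
        while True:
            try:
                records.append(pickle.load(data_file))
            except EOFError:
                return records
```

Using os.replace combines the DELETE and RENAME steps of the pseudocode into one operation.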
Note that you can directly append a record to the end of a file in a programming language by opening the file in append mode, as shown in the table below.

Opening a file in append mode                              Language
myFile = open("fileName", "a")                             Opens the file with the name fileName in append mode in Python
myFile = New FileStream("fileName", FileMode.Append)       Opens the file with the name fileName in append mode in VB.NET
FileWriter myFile = new FileWriter("fileName", true);      Opens the file with the name fileName in append mode in Java
▲ Table 20.12
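For a binary file of pickled records, the Python mode string becomes "ab". The short sketch below assumes records are stored as dictionaries; the function names append_record and count_records are invented for the example.

```python
import pickle

def append_record(file_name, record):
    """Add one record to the end of file_name, creating the file if needed."""
    with open(file_name, "ab") as data_file:   # "a" = append, "b" = binary
        pickle.dump(record, data_file)

def count_records(file_name):
    """Count the records in file_name by reading until the end of the file."""
    total = 0
    with open(file_name, "rb") as data_file:
        while True:
            try:
                pickle.load(data_file)
                total += 1
            except EOFError:
                return total
```

Calling append_record twice on the same file name leaves two records in the file, because append mode never overwrites the existing contents.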
ACTIVITY 20M
In the programming language of your choice, write a program to
■ input a student record and append it to the end of a sequential file
■ find and output a student record from a sequential file using the key field to identify the record
■ extend your program to check for record not found (if required).

EXTENSION ACTIVITY 20F
Extend your program to input a student record and save it in the correct place in the sequential file created in Extension Activity 20E.
Adding a record to a random file
Records can be added to a random file by using a hashing function on the key field of the record to be added. The hashing function returns a pointer to the address where the record is to be added.
In pseudocode, the address in the file can be found using the command:
SEEK <file identifier>, <address>
The record can be stored in the file using the command:
PUTRECORD <file identifier>, <variable>
Or it can be retrieved using:
GETRECORD <file identifier>, <variable>
The file needs to be opened as random:
OPEN studentFile FOR RANDOM
The algorithm written in pseudocode below inserts a student record into a random file.
DECLARE studentRecord : TstudentRecord
DECLARE studentFile : STRING
DECLARE address : INTEGER
studentFile ← "studentFile.dat"
OPEN studentFile FOR RANDOM    // opens file for random access both read and write
OUTPUT "Please enter student details"
OUTPUT "Please enter student name"
INPUT studentRecord.name
OUTPUT "Please enter student’s register number"
INPUT studentRecord.registerNumber
OUTPUT "Please enter student’s date of birth"
INPUT studentRecord.dateOfBirth
OUTPUT "Please enter True for full-time or False for part-time"
INPUT studentRecord.fullTime
address ← hash(studentRecord.registerNumber)    // uses function hash to find pointer to address
SEEK studentFile, address    // finds address in file
PUTRECORD studentFile, studentRecord    // writes record to the file
CLOSEFILE(studentFile)
EXTENSION ACTIVITY 20G
In the programming language of your choice, write a program to input a student record and save it to a random file.

Finding a record in a random file
Records can be found in a random file by using a hashing function on the key field of the record to be found. The hashing function returns a pointer to the address where the record is to be found, as shown in the example pseudocode below.
DECLARE studentRecord : TstudentRecord
DECLARE studentFile : STRING
DECLARE address : INTEGER
studentFile ← "studentFile.dat"
OPEN studentFile FOR RANDOM    // opens file for random access both read and write
OUTPUT "Please enter student’s register number"
INPUT studentRecord.registerNumber
address ← hash(studentRecord.registerNumber)    // uses function hash to find pointer to address
SEEK studentFile, address    // finds address in file
GETRECORD studentFile, studentRecord    // reads record from the file
OUTPUT studentRecord
CLOSEFILE(studentFile)

EXTENSION ACTIVITY 20H
In the programming language of your choice, write a program to find and output a student record from a random file using the key field to identify the record.
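One way to picture SEEK, PUTRECORD and GETRECORD is with fixed-length records, where the byte position of a record is its address multiplied by the record size. The Python sketch below makes several assumptions that are not in the text above: the record layout (a 20-byte name, an integer register number and a Boolean), a file of 100 slots, a hashing function of key MOD 100, register numbers that start at 1, and no handling of collisions. All function names are invented for the example.

```python
import struct

RECORD_FORMAT = "<20si?"                 # 20-byte name, integer key, Boolean flag
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)
NUMBER_OF_SLOTS = 100                    # assumed capacity of the file

def hash_address(register_number):
    """Hypothetical hashing function: key MOD number of slots."""
    return register_number % NUMBER_OF_SLOTS

def create_file(file_name):
    """Create a file in which every slot is empty (all zero bytes)."""
    with open(file_name, "wb") as data_file:
        data_file.write(bytes(RECORD_SIZE * NUMBER_OF_SLOTS))

def put_record(file_name, name, register_number, full_time):
    """SEEK to the hashed address, then write the record there."""
    packed = struct.pack(RECORD_FORMAT, name.encode("utf-8"), register_number, full_time)
    with open(file_name, "r+b") as data_file:      # read and write, no truncation
        data_file.seek(hash_address(register_number) * RECORD_SIZE)
        data_file.write(packed)

def get_record(file_name, register_number):
    """SEEK to the hashed address and read the record, or return None if absent."""
    with open(file_name, "rb") as data_file:
        data_file.seek(hash_address(register_number) * RECORD_SIZE)
        raw = data_file.read(RECORD_SIZE)
    name, stored_number, full_time = struct.unpack(RECORD_FORMAT, raw)
    if stored_number != register_number:           # empty slot or a different key
        return None
    return name.rstrip(b"\x00").decode("utf-8", errors="ignore"), stored_number, full_time
```

Because the address is calculated from the key, a record is reached with a single seek rather than by reading through all the preceding records. A complete solution would also need to deal with two keys that hash to the same address.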
20.2.2 Exception handling
An exception is an unexpected event that disrupts the execution of a program. Exception handling is the process of responding to an exception within the program so that the program does not halt unexpectedly. Exception handling makes a program more robust as the exception routine traps the error then outputs an error message, which is followed by either an orderly shutdown of the program or recovery if possible.
An exception may occur in many different ways, for example
» dividing by zero during a calculation
» reaching the end of a file unexpectedly when trying to read a record from a file
» trying to open a file that has not been created
» losing a connection to another device, such as a printer.
Exceptions can be caused by
» programming errors
» user errors
» hardware failure.
Error handling is one of the most important aspects of writing robust programs that are to be used every day, as users frequently make errors without realising, and hardware can fail at any time. Frequently, error handling routines can take a programmer as long, or even longer, to write and test as the program to perform the task itself.
The structure for error handling can be shown in pseudocode as:
TRY
EXCEPT
ENDTRY
Here are programs in Python, VB and Java to catch an integer division by zero exception.

Python
def division(firstNumber, secondNumber):
    try:
        myAnswer = firstNumber // secondNumber    # integer division //
        print('Answer ', myAnswer)
    except ZeroDivisionError:
        print('Divide by zero')

division(12, 3)
division(10, 0)

VB
Module Module1
    Public Sub Main()
        division(12, 3)
        division(10, 0)
        Console.ReadKey()
    End Sub

    Sub division(ByVal firstNumber As Integer, ByVal secondNumber As Integer)
        Dim myAnswer As Integer
        Try
            myAnswer = firstNumber \ secondNumber    ' integer division \
            Console.WriteLine("Answer " & myAnswer)
        Catch e As DivideByZeroException
            Console.WriteLine("Divide by zero")
        End Try
    End Sub
End Module

Java
public class Division {
    public static void main(String[] args) {
        division(12, 3);
        division(10, 0);
    }

    public static void division(int firstNumber, int secondNumber) {
        int myAnswer;
        try {
            // automatic integer division because there are integers on both sides of the division operator
            myAnswer = firstNumber / secondNumber;
            System.out.println("Answer " + myAnswer);
        } catch (ArithmeticException e) {
            System.out.println("Divide by zero");
        }
    }
}
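The same structure can trap the file-related exceptions listed earlier, such as opening a file that has not been created or reaching the end of a file unexpectedly. The Python sketch below assumes the file holds pickled records; the function name read_first_record is invented for the example.

```python
import pickle

def read_first_record(file_name):
    """Return the first record in file_name, or None if it cannot be read."""
    try:
        with open(file_name, "rb") as student_file:
            return pickle.load(student_file)
    except FileNotFoundError:
        # raised by open() when the file has not been created
        print("File", file_name, "has not been created")
    except EOFError:
        # raised by pickle.load() when the file contains no record
        print("Unexpected end of file in", file_name)
    return None
```

Naming each exception separately allows a different message, or a different recovery action, for each kind of failure.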
ACTIVITY 20N In the programming language of your choice, write a program to check that a value input is an integer.
ACTIVITY 20O In the programming language of your choice, extend the file handling programs you wrote in Section 20.2.1 to use exception handling to ensure that the files used exist and allow for the condition unexpected end of file.
End of chapter questions
1 A declarative programming language is used to represent the following facts and rules:
01 male(ahmed).
02 male(raul).
03 male(ali).
04 male(philippe).
05 female(aisha).
06 female(gina).
07 female(meena).
08 parent(ahmed, raul).
09 parent(aisha, raul).
10 parent(ahmed, philippe).
11 parent(aisha, philippe).
12 parent(ahmed, gina).
13 parent(aisha, gina).
14 mother(A, B) IF female(A) AND parent(A, B).
These clauses have the following meaning:
Clause    Explanation
01        Ahmed is male
05        Aisha is female
08        Ahmed is a parent of Raul
14        A is the mother of B if A is female and A is a parent of B
a) More facts are to be included. Ali and Meena are the parents of Ahmed. Write the additional clauses to record this. [2]
15 …………………………………………………………………………………………………………………
16 …………………………………………………………………………………………………………………
b) Using the variable C, the goal
parent(ahmed, C)
returns
C = raul, philippe, gina
Write the result returned by the goal
parent(P, gina)
P = ………………………………………………………………………………………………………………… [2]
c) Use the variable M to write the goal to find the mother of Gina. [1]
d) Write the rule to show that F is the father of C. [2]
father(F, C) IF…………………………………………………………………………
e) Write the rule to show that X is a brother of Y. [4]
brother(X, Y) IF…………………………………………………………………………
Cambridge International AS & A Level Computer Science 9608 Paper 42 Q2 November 2015

2 A college has two types of student: full-time and part-time. All students have their name and date of birth recorded. A full-time student has their address and telephone number recorded. A part-time student attends one or more courses. A fee is charged for each course. The number of courses a part-time student attends is recorded, along with the total fee and whether or not the fee has been paid. The college needs a program to process data about its students. The program will use an object-oriented programming language.
a) Copy and complete the class diagram showing the appropriate properties and methods. [7]
[Class diagram: a box for the superclass Student, showing the property StudentName : STRING and the method ShowStudentName (), and boxes for the subclasses FullTimeStudent, showing the property Address : STRING, and PartTimeStudent. The labels Constructor () and showAddress () also appear in the subclass boxes. All other entries are dotted lines to be completed.]
b) Write program code:
i) for the class definition for the superclass Student. [2]
ii) for the class definition for the subclass FullTimeStudent. [3]
iii) to create a new instance of FullTimeStudent with:
– identifier: NewStudent
– name: A. Nyone
– date of birth: 12/11/1990
– telephone number: 099111 [3]
Cambridge International AS & A Level Computer Science 9608 Paper 42 Q3 November 2015

3 a) When designing and writing program code, explain what is meant by:
– an exception
– exception handling. [3]
b) A program is to be written to read a list of exam marks from an existing text file into a 1D array. Each line of the file stores the mark for one student. State three exceptions that a programmer should anticipate for this program. [3]
c) The following pseudocode is to read two numbers.
01 DECLARE Num1 : INTEGER
02 DECLARE Num2 : INTEGER
03 DECLARE Answer : INTEGER
04 TRY
05 OUTPUT "First number..."
06 INPUT Num1
07 OUTPUT "Second number..."
08 INPUT Num2
09 Answer ← Num1 / (Num2 − 6)
10 OUTPUT Answer
11 EXCEPT ThisException : EXCEPTION
12 OUTPUT ThisException.Message
13 FINALLY
14 // remainder of the program follows
29
30 ENDTRY
The programmer writes the corresponding program code. A user inputs the number 53 followed by 6. The following output is produced:
First number...53
Second number...6
Arithmetic operation resulted in an overflow
i) State the pseudocode line number which causes the exception to be raised. [1]
ii) Explain the purpose of the pseudocode on lines 11 and 12. [3]
Cambridge International AS & A Level Computer Science 9608 Paper 42 Q5(b)–(d) June 2016
A* algorithm – an algorithm that finds the shortest route between nodes or vertices but uses an additional heuristic approach to achieve better performance than Dijkstra’s algorithm. Abnormal test data – test data that should be rejected by a program. Absolute addressing – mode of addressing in which the contents of the memory location in the operand are used. Abstract data type (ADT) – a collection of data and a set of operations on that data. Abstraction – the process of extracting information that is essential, while ignoring what is not relevant, for the provision of a solution. Acceptance testing – the testing of a completed program to prove to the customer that it works as required. Access rights (data security) – use of access levels to ensure only authorised users can gain access to certain data. Access rights (database) – the permissions given to database users to access, modify or delete data. Accumulator – temporary general purpose register which stores numerical values at any part of a given operation. Acknowledgement – message sent to a receiver to indicate that data has been received without error. ACM – Association for Computing Machinery. Adaptive maintenance – the alteration of a program to perform new tasks. Address bus – carries the addresses throughout the computer system.
Glossary
Glossary Array – a data structure containing several elements of the same data type. Artificial intelligence (AI) – machine or application which carries out a task that requires some degree of intelligence when carried out by a human counterpart. Artificial neural networks – networks of interconnected nodes based on the interconnections between neurons in the human brain; the system is able to think like a human using these neural networks, and its performance improves with more data. ASCII code – coding system for all the characters on a keyboard and control codes. Assembler – a computer program that translates programming code written in assembly language into machine code; assemblers can be one pass or two pass. Assembly language – a low-level chip/machine specific programming language that uses mnemonics. Asymmetric encryption – encryption that uses public keys (known to everyone) and private keys (secret keys). Asynchronous serial data transmission – serial refers to a single wire being used to transmit bits of data one after the other; asynchronous refers to a sender using its own clock/ timer device rather sharing the same clock/timer with the recipient device. Attribute (database) – an individual data item stored for an entity; for example, for a person, attributes could include name, address, date of birth. Attributes (class) – the data items in a class.
Addressing modes – different methods of using the operand part of a machine code instruction as a memory address.
Audio compression – method used to reduce the size of a sound file using perceptual music shaping.
Algorithm – an ordered set of steps to be followed in the completion of a task.
Authentication – a way of proving somebody or something is who or what they claim to be.
Alpha testing – the testing of a completed or nearly completed program in-house by the development team.
Automatic repeat request (ARQ) – a type of verification check.
Analogue to digital converter (ADC) – needed to convert analogue data (read from sensors, for example) into a form understood by a computer.
Back propagation – method used in artificial neural networks to calculate error gradients so that actual node/neuron weightings can be adjusted to improve the performance of the model.
Analysis – part of the program development lifecycle; a process of investigation, leading to the specification of what a program is required to do.
Back-up utility – software that makes copies of files on another portable storage device.
Anti-spyware software – software that detects and removes spyware programs installed illegally on a user’s computer system. Antivirus software – software that quarantines and deletes files or programs infected by a virus (or other malware); it can be run in the background or initiated by the user. Append – file access mode in which data can be added to the end of a file. Argument – the value passed to a procedure or function. Arithmetic shift – the sign of the number is preserved. Arithmetic-logic unit (ALU) – component in the processor which carries out all arithmetic and logical operations. ARPAnet – Advanced Research Projects Agency Network.
Backus-Naur form (BNF) notation – a formal method of defining the grammatical rules of a programming language. Bad sector – a faulty sector on an HDD which can be soft or hard. Base case – a terminating solution to a process that is not recursive. BCS – British Computer Society. Belady’s anomaly – phenomenon which means it is possible to have more page faults when increasing the number of page frames. Beta testing – the testing of a completed program by a small group of users before it is released. Bidirectional – used to describe a bus in which bits can travel in both directions.
Binary – base two number system based on the values 0 and 1 only.
Bubble sort – a method of sorting data in an array into alphabetical or numerical order by comparing adjacent items and swapping them if they are in the wrong order.
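As a minimal sketch of this definition, the Python function below repeatedly compares adjacent items and swaps them when they are out of order. The name `bubble_sort` and the early exit when a pass makes no swaps are illustrative choices, not taken from the textbook.

```python
def bubble_sort(items):
    """Return a new list sorted into ascending order using bubble sort."""
    data = list(items)  # work on a copy so the caller's list is unchanged
    # After each pass the largest remaining item has "bubbled" to position `end`.
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if data[i] > data[i + 1]:  # adjacent items in the wrong order
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:  # no swaps means the list is already sorted
            break
    return data
```

The same function works for numerical or alphabetical order because Python compares both numbers and strings with `>`.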
Binary coded decimal (BCD) – number system that uses 4 bits to represent each denary digit.
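To illustrate, the short Python sketch below encodes each denary digit as its own 4-bit group. The function name `to_bcd` and the space-separated output format are assumptions made for readability.

```python
def to_bcd(number):
    """Encode a non-negative denary integer as space-separated 4-bit groups."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Each denary digit is converted separately into exactly 4 bits.
    return " ".join(format(int(digit), "04b") for digit in str(number))
```

For example, `to_bcd(59)` gives `"0101 1001"`: the digit 5 becomes 0101 and the digit 9 becomes 1001.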
Buffering – store which holds data temporarily. Burst time – the time when a process has control of the CPU.
Binary file – a file that does not contain text only; the file is machine-readable but not human-readable.
Bus network topology – network using single central cable in which all devices are connected to this cable; data can only travel in one direction and only one device is allowed to transmit at a time.
Big O notation – a mathematical notation used to describe the performance or complexity of an algorithm.
Binary floating-point number – a binary number written in the form M × 2^E (where M is the mantissa and E is the exponent). Binary search – a method of searching an ordered list by testing the value of the middle item in the list and rejecting the half of the list that does not contain the required value. Binary tree – a hierarchical data structure in which each parent node can have a maximum of two child nodes. Binder 3D printing – 3D printing method that uses a two-stage pass; the first stage uses dry powder and the second stage uses a binding agent. Biometrics – use of unique human characteristics to identify a user (such as fingerprints or face recognition). BIOS – basic input/output system. Birefringence – a reading problem with DVDs caused by refraction of laser light into two beams. Bit – abbreviation for binary digit. Bit depth – number of bits used to represent the smallest unit in, for example, a sound or image file; the larger the bit depth, the better the quality of the sound or colour image. Bit rate – number of bits per second that can be transmitted over a network; it is a measure of the data transfer rate over a digital telecoms network. Bit streaming – contiguous sequence of digital bits sent over a network/internet. Bit-map image – system that uses pixels to make up an image. BitTorrent – protocol used in peer-to-peer networks when sharing files between peers. Black-box testing – a method of testing a program that tests a module’s inputs and outputs. Block chaining – form of encryption, in which the previous block of ciphertext is XORed with the block of plaintext and then encrypted thus preventing identical plaintext blocks producing identical ciphertext. Block cipher – the encryption of a number of contiguous bits in one go rather than one bit at a time. Bluetooth – wireless connectivity that uses radio waves in the 2.45 GHz frequency band.
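The binary search entry above can be sketched in Python as follows. The function name `binary_search` and the convention of returning -1 when the value is absent are illustrative assumptions; the list must already be in ascending order.

```python
def binary_search(ordered, target):
    """Return the index of target in an ascending list, or -1 if absent."""
    low, high = 0, len(ordered) - 1
    while low <= high:
        mid = (low + high) // 2  # test the middle item
        if ordered[mid] == target:
            return mid
        if ordered[mid] < target:
            low = mid + 1   # reject the lower half
        else:
            high = mid - 1  # reject the upper half
    return -1
```

Each comparison halves the part of the list still being searched, which is why the method only works on ordered data.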
By reference – a method of passing a parameter to a procedure in which the value of the variable can be changed by the procedure. By value – a method of passing a parameter to a procedure in which the value of the variable cannot be changed by the procedure. Cache memory – a high speed auxiliary memory which permits high speed data transfer and retrieval. Candidate key – an attribute or smallest set of attributes in a table where no tuple has the same value. Capacitive – type of touch screen technology based on glass layers forming a capacitor; fingers touching the screen cause a change in the electric field. Certificate authority (CA) – commercial organisation used to generate a digital certificate requested by website owners or individuals. Character set – a list of characters that have been defined by computer hardware and software; it is necessary to have a method of coding, so that the computer can understand human characters. Chatbot – computer program set up to simulate conversational interaction between humans and a website. Check digit – additional digit appended to a number to check if entered data is error-free. Checksum – verification method used to check if data transferred has been altered or corrupted; calculated from the block of data to be sent. Ciphertext – the product when plaintext is put through an encryption algorithm. Circuit switching – method of transmission in which a dedicated circuit/channel lasts throughout the duration of the communication. CISC – complex instruction set computer. Class – a template defining the methods and data of a certain type of object.
Boolean algebra – a form of algebra linked to logic circuits and based on TRUE and FALSE.
Classless inter-domain routing (CIDR) – increases IPv4 flexibility by adding a suffix to the IP address, such as 200.21.100.6/18.
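As a hedged illustration of what the /18 suffix means, the sketch below uses Python's standard `ipaddress` module to split the example address into its network part and the number of addresses the network can hold. The helper name `describe_cidr` is hypothetical.

```python
import ipaddress


def describe_cidr(cidr):
    """Return (network address, prefix length, address count) for a CIDR string."""
    interface = ipaddress.ip_interface(cidr)
    network = interface.network  # the network that contains this address
    return str(network.network_address), network.prefixlen, network.num_addresses
```

For `"200.21.100.6/18"` the first 18 bits identify the network (200.21.64.0), leaving 14 bits for hosts, so the network spans 2^14 = 16 384 addresses.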
Bootstrap – a small program that is used to load other programs to ‘start up’ a computer.
CLI – command line interface.
Boundary test data – test data that is on the limit of that accepted by a program or data that is just outside the limit of that rejected by a program. Breakpoint – a deliberate pause in the execution of a program during testing so that the contents of variables, registers, and so on can be inspected to aid debugging. Bridge – device that connects LANs which use the same protocols. Broadcast – communication where pieces of data are sent from sender to receiver.
Client-server – network that uses separate dedicated servers and specific client work stations; all client computers are connected to the dedicated servers. Clock cycle – clock speeds are measured in terms of GHz; this is the vibrational frequency of the clock which sends out pulses along the control bus; a 3.5 GHz clock cycle means 3.5 billion clock cycles a second. Close – file-processing operation; closes a file so it can no longer be used by a program.
Data bus – allows data to be carried from processor to memory (and vice versa) or to and from input/output devices.
Cluster – a number of computers (containing SIMD processors) networked together.
Data definition language (DDL) – a language used to create, modify and remove the data structures that form a database.
CMOS – complementary metal-oxide semiconductor.
Data dictionary – a set of data that contains metadata (data about other data) for a database.
Coaxial cable – cable made up of central copper core, insulation, copper mesh and outer insulation. Code generation – the third stage in the process of compilation; this stage produces an object program. Coding – part of the program development lifecycle; the writing of the program or suite of programs. Collision – situation in which two messages/data from different sources are trying to transmit along the same data channel. Colour depth – number of bits used to represent the colours in a pixel, e.g. 8 bit colour depth can represent 2^8 = 256 colours. Combination circuit – circuit in which the output depends entirely on the input values. Compiler – a computer program that translates a source program written in a high-level language to machine code or p-code, object code. Composite data type – a data type constructed using several of the basic data types available in a particular programming language. Composite key – a set of attributes that form a primary key to provide a unique identifier for a table. Conflict – situation in which two devices have the same IP address. Constant – a named value that cannot change during the execution of a program. Constructor – a method used to initialise a new object.
Data hiding – technique which protects the integrity of an object by restricting access to the data and methods within that object. Data integrity – the accuracy, completeness and consistency of data. Data management – the organisation and maintenance of data in a database to provide the information required. Data manipulation language (DML) – a language used to add, modify, delete and retrieve the data stored in a relational database. Data modelling – the analysis and definition of the data structures required in a database and to produce a data model. Data privacy – the privacy of personal information, or other information stored on a computer, that should not be accessed by unauthorised parties. Data protection laws – laws which govern how data should be kept private and secure. Data redundancy – situation in which the same data is stored on several servers in case of maintenance or repair. Data security – methods taken to prevent unauthorised access to data and to recover data if lost or corrupted.
Containment (aggregation) – process by which one class can contain other classes.
Data type – a classification attributed to an item of data, which determines the types of value it can take and how it can be used.
Context switching – procedure by which, when the next process takes control of the CPU, its previous state is reinstated or restored.
Database – a structured collection of items of data that can be accessed by different applications programs.
Contiguous – items next to each other.
Database management system (DBMS) – systems software for the definition, creation and manipulation of a database.
Control – to automatically take readings from a device, then use the data from those readings to adjust the device.
Debugging – the process of finding logic errors in a computer program by running or tracing the program.
Control bus – carries signals from control unit to all other computer components.
Declarative programming – statements of facts and rules together with a mechanism for setting goals in the form of a query.
Control unit – ensures synchronisation of data flow and programs throughout the computer by sending out control signals along the control bus. Core – a unit made up of ALU, control unit and registers which is part of a CPU; a CPU may contain a number of cores. Corrective maintenance – the correction of any errors that appear during use.
Decomposition – the process of breaking a complex problem into smaller parts. Deep learning – machines that think in a way similar to the human brain; they handle huge amounts of data using artificial neural networks.
Cross-coupling – interconnection between two logic gates which make up a flip-flop.
Design – part of the program development lifecycle; it uses the program specification from the analysis stage to show how the program should be developed.
CSMA/CD – carrier sense multiple access with collision detection; a method used to detect collisions and resolve the issue.
Destructor – a method that is automatically invoked when an object is destroyed.
Culture – the attitudes, values and practices shared by a group of people/society.
Developer interface – feature of a DBMS that provides developers with the commands required for definition, creation and manipulation of a database.
Current instruction register (CIR) – this is a register used to contain the instruction which is currently being executed or decoded. Cyclic shift – no bits are lost; bits shifted out of one end of the register are introduced at the other end of the register.
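A cyclic shift can be sketched in Python as below. The 8-bit default register width and the name `cyclic_shift_left` are assumptions made for the example; the point is that bits leaving the left end re-enter at the right end, so none are lost.

```python
def cyclic_shift_left(value, places, width=8):
    """Rotate the lowest `width` bits of value left by `places` positions."""
    places %= width                 # rotating by the full width changes nothing
    mask = (1 << width) - 1         # e.g. 0b11111111 for an 8-bit register
    value &= mask                   # ignore any bits outside the register
    wrapped = value >> (width - places)  # bits that fall off the left end
    return ((value << places) | wrapped) & mask
```

For example, rotating `0b10010110` left by one place gives `0b00101101`: the leading 1 reappears as the rightmost bit.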
Cloud storage – method of data storage where data is stored on off-site servers.
Device driver – software that communicates with the operating system and translates data into a format understood by the device.
Dictionary – an abstract data type that consists of pairs, a key and a value, in which the key is used to find the value.
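Python's built-in `dict` type is one implementation of this abstract data type. The sketch below uses made-up example data and a hypothetical helper, `lookup`, to show a key being used to find its value.

```python
# Each entry is a (key, value) pair; the data here is illustrative only.
capitals = {"France": "Paris", "Japan": "Tokyo"}
capitals["Kenya"] = "Nairobi"  # add a new key/value pair


def lookup(table, key):
    """Return the value stored for key, or 'not found' if the key is absent."""
    return table.get(key, "not found")
```

Here `lookup(capitals, "Japan")` finds `"Tokyo"`, while a key that was never stored produces `"not found"` rather than an error.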
Dynamic RAM (DRAM) – type of RAM chip that needs to be constantly refreshed.
Digest – a fixed-size numeric representation of the contents of a message produced from a hashing algorithm; this can be encrypted to form a digital signature.
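As a rough illustration, the sketch below produces a digest with Python's standard `hashlib` module. SHA-256 is chosen here only as an example; the glossary does not name a particular hashing algorithm, and the function name `digest` is hypothetical.

```python
import hashlib


def digest(message):
    """Return the SHA-256 digest of a text message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()
```

Whatever the length of the message, the result is always 64 hexadecimal characters (256 bits), and changing a single character of the message produces a different digest.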
Eavesdropper – a person who intercepts data being transmitted.
Digital certificate – an electronic document used to prove the identity of a website or individual; it contains a public key and information identifying the website owner or individual; issued by a CA. Digital rights management (DRM) – used to control the access to copyrighted material. Digital signature – electronic way of validating the authenticity of digital documents (that is, making sure they have not been tampered with during transmission) and also proof that a document was sent by a known user. Digital to analogue converter (DAC) – needed to convert digital data into electric currents that can drive motors, actuators and relays, for example. Dijkstra’s algorithm – an algorithm that finds the shortest path between two nodes or vertices in a graph/network. Direct 3D printing – 3D printing technique where print head moves in the x, y and z directions. Layers of melted material are built up using nozzles like an inkjet printer. Direct access – a method of file access in which a record can be physically found in a file without physically reading other records. Direct addressing – mode of addressing in which the contents of the memory location in the operand are used; same as absolute addressing. Direct memory access (DMA) controller – device that allows certain hardware to access RAM independently of the CPU. Dirty – term used to describe a page in memory that has been modified. Disk compression – software that compresses data before storage on an HDD. Disk content analysis software – utility that checks disk drives for empty space and disk usage by reviewing files and folders. Disk defragmenter – utility that reorganises the sectors on a hard disk so that files can be stored in contiguous data blocks. Disk formatter – utility that prepares a disk to allow data/files to be stored and retrieved. 
Disk thrashing – problem resulting from use of virtual memory; excessive swapping in and out of virtual memory leads to a high rate of hard disk read/write head movements thus reducing processing speed. DNS cache poisoning – altering IP addresses on a DNS server by a ‘pharmer’ or hacker with the intention of redirecting a user to their fake website.
Electronically erasable programmable read-only memory (EEPROM) – a read-only (ROM) chip that can be modified by the user which can then be erased and written to repeatedly using pulsed voltages. Emulation – the use of an app/device to imitate the behaviour of another program/device; for example, running an OS on a computer which is not normally compatible. Encapsulation – process of putting data and methods together as a single unit, a class. Encryption – the use of encryption keys to make data meaningless without the correct decryption key. Entity – anything that can have data stored about it; for example, a person, place, event, thing. Entity-relationship (E-R) model or E-R diagram – a graphical representation of a database and the relationships between the entities. Enumerated data type – a non-composite data type defined by a given list of all possible values that has an implied order. Erasable PROM (EPROM) – type of ROM that can be programmed more than once using ultraviolet (UV) light. Ethernet – protocol IEEE 802.3 used by many wired LANs. Ethical hacking – hacking used to test the security and vulnerability of a computer system; the hacking is carried out with the permission of the computer system owner, for example, to help a company identify risks associated with malicious hacking of their computer systems. Ethics – moral principles governing an individual’s or organisation’s behaviour, such as a code of conduct. Even parity – binary number with an even number of 1-bits. Exception – an unexpected event that disrupts the execution of a program. Exception handling – the process of responding to an exception within the program so that the program does not halt unexpectedly. Exponent – the power of 2 that the mantissa (fractional part) is raised to in a floating-point number. Extreme test data – test data that is on the limit of that accepted by a program. Fact – a ‘thing’ that is known. 
False positive – a file or program identified by a virus checker as being infected but the user knows this cannot be correct. Fetch-execute cycle – a cycle in which instructions and data are fetched from memory and then decoded and finally executed. Fibre optic cable – cable made up of glass fibre wires which use pulses of light (rather than electricity) to transmit data.
Domain name service (DNS) – (also known as domain name system) gives domain names for internet hosts and is a system for finding IP addresses of a domain name.
Field – a column in a table in a database.
Dry run – a method of testing a program that involves working through a program or module from a program manually.
File access – the method used to physically find a record in the file.
Dual core – a CPU containing two cores. Dual layering – used in DVDs; uses two recording layers. Dynamic link file (DLL) – a library routine that can be linked to another program only at the run time stage.
File – a collection of data stored by a computer program to be used again.
File organisation – the way that records of data are physically stored in a file, including the structure and ordering of the records.
Handshake – the process of initiating communication between two devices; this is initiated by one device sending a message to another device requesting the exchange of data.
Finite state machine (FSM) – a mathematical model of a machine that can be in one state of a fixed set of possible states; one state is changed to another by an external input; this is known as a transition.
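The definition can be sketched with a transition table. The turnstile below, with states `locked` and `unlocked` and inputs `coin` and `push`, is a made-up example rather than one from the textbook, and the names `TRANSITIONS` and `run_fsm` are hypothetical.

```python
# (current state, external input) -> next state
TRANSITIONS = {
    ("locked", "coin"): "unlocked",
    ("locked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",
    ("unlocked", "push"): "locked",
}


def run_fsm(inputs, state="locked"):
    """Apply each input in turn and return the final state."""
    for symbol in inputs:
        state = TRANSITIONS[(state, symbol)]  # one transition per input
    return state
```

The machine is only ever in one of its two states, and each input causes exactly one transition.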
Hard disk drive (HDD) – type of magnetic storage device that uses spinning disks.
Firewall – software or hardware that sits between a computer and external network which monitors and filters all incoming and outgoing activities. First in first out (FIFO) page replacement – page replacement that keeps track of all pages in memory using a queue structure; the oldest page is at the front of the queue and is the first to be removed when a new page is added. First normal form (1NF) – the status of a relational database in which entities do not contain repeated groups of attributes. Flag – indicates the status of a bit in the status register; for example, N = 1 indicates the result of an addition gives a negative value. Flash memory – a type of EEPROM, particularly suited to use in drives such as SSDs, memory cards and memory sticks. Flip-flop circuits – electronic circuits with two stable conditions using sequential circuits. Flowchart – a diagrammatic representation of an algorithm. Foreign key – a set of attributes in one table that refer to the primary key in another table. Fragmented – storage of data in non-consecutive sectors; for example, due to editing and deletion of old data. Frame rate – number of video frames that make up a video per second. Frames – fixed-size physical memory blocks. Free Software Foundation – organisation promoting the free distribution of software, giving users the freedom to run, copy, change or adapt the coding as needed. Freeware – software that can be downloaded free of charge; however, it is covered by the usual copyright laws and cannot be modified; nor can the code be used for another purpose. FTP – file transfer protocol. Full adder circuit – two half adders combined to allow the sum of several binary bits.
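The full adder entry above can be modelled in software. The sketch below treats each logic gate as a Python bitwise operator (XOR as `^`, AND as `&`, OR as `|`); the function names are illustrative, and the code is a simulation of the logic rather than a circuit design.

```python
def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    return a ^ b, a & b  # XOR gives the sum bit, AND gives the carry bit


def full_adder(a, b, carry_in):
    """Add two bits plus a carry-in using two half adders; return (sum, carry_out)."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2  # OR combines the two possible carries
```

Chaining full adders, with each carry-out feeding the next carry-in, is what allows multi-bit binary numbers to be added.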
Hardware management – part of the operating system that controls all input/output devices connected to a computer (made up of sub-management systems such as printer management, secondary storage management, and so on). Hashing algorithm (cryptography) – a function which converts a data string into a numeric string which is used in cryptography. Hashing algorithm (file access) – a mathematical formula used to perform a calculation on the key field of the record; the result of the calculation gives the address where the record should be found. HCI – human–computer interface. Header (procedure or function) – the first statement in the definition of a procedure or function, which contains its name, any parameters passed to it, and, for a function, the type of the return value. Header (data packet) – part of a data packet containing key data such as destination IP address, sequence number, and so on. Heuristic – method that employs a practical solution (rather than a theoretical one) to a problem; when applied to algorithms this includes running tests and obtaining results by trial and error. Heuristic checking – checking of software for behaviour that could indicate a possible virus. Hexadecimal – a number system based on the value 16 (uses the denary digits 0 to 9 and the letters A to F). High-bandwidth digital copy protection (HDCP) – part of HDMI technology which reduces risk of piracy of software and multimedia. High-definition multimedia interface (HDMI) – type of port connecting devices to a computer. Hop number/hopping – number in the packet header used to stop packets which never reach their destination from ‘clogging up’ routes. Host – a computer or device that can communicate with other computers or devices on a network.
Function – a set of statements that can be grouped together and easily called in a program whenever required, rather than repeating all of the statements each time. Unlike a procedure, a function always returns a value.
Host OS – an OS that controls the physical hardware.
Gateway – device that connects LANs which use different protocols.
HTTP – hypertext transfer protocol.
General case – a solution to a process that is recursively defined. Getter – a method that gets the value of a property. Graph – a non-linear data structure consisting of nodes and edges. Gray codes – ordering of binary numbers such that successive numbers differ by one bit value only, for example, 00 01 11 10. Guest OS – an OS running on a virtual machine. GUI – graphical user interface. Hacking – illegal access to a computer system without the owner’s permission. Half adder circuit – carries out binary addition on two bits giving sum and carry.
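Among the entries above, Gray codes lend themselves to a short worked example. One standard way to generate them, used in the sketch below, is to XOR each number with itself shifted right by one bit; the function names are illustrative.

```python
def gray_code(n):
    """Return the Gray code value for the non-negative integer n."""
    return n ^ (n >> 1)


def gray_sequence(bits):
    """Return all Gray codes of the given width as binary strings, in order."""
    return [format(gray_code(i), f"0{bits}b") for i in range(2 ** bits)]
```

`gray_sequence(2)` reproduces the ordering quoted in the definition: 00, 01, 11, 10.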
File server – a server on a network where central files and other data are stored; they can be accessed by a user logged onto the network.
Host-to-host – a protocol used by TCP when communicating between two devices. Hub – hardware used to connect together a number of devices to form a LAN; directs incoming data packets to all devices on the network (LAN). Hybrid network – network made up of a combination of other network topologies. HyperText Mark-up Language (HTML) – used to design web pages and to write http(s) protocols, for example. Hypervisor – virtual machine software that creates and runs virtual machines. Icon – small picture or symbol used to represent, for example, an application on a screen. Identifier – a unique name applied to an item of data.
IEEE – Institute of Electrical and Electronics Engineers. Image resolution – number of pixels that make up an image; for example, an image could contain 4096 × 3192 pixels (13 074 432 pixels in total). IMAP – internet message access protocol. Immediate access store (IAS) – holds all data and programs needed to be accessed by the control unit. Immediate addressing – mode of addressing in which the value of the operand only is used. Imperative programming – programming paradigm in which the steps required to execute a program are set out in the order they need to be carried out. In demand paging – a form of data swapping where pages of data are not copied from HDD/SSD into RAM until they are actually required. Index (database) – a data structure built from one or more columns in a database table to speed up searching for data. Index (array) – a numerical indicator of an item of data’s position in an array. Indexed addressing – mode of addressing in which the contents of the memory location found by adding the contents of the index register (IR) to the address of the memory location in the operand are used. Indirect addressing – mode of addressing in which the contents of the contents of the memory location in the operand are used. Inheritance – process in which the methods and data from one class, a superclass or base class, are copied to another class, a derived class. Insertion sort – a method of sorting data in an array into alphabetical or numerical order by placing each item in turn in the correct position in the sorted list. Instance – an occurrence of an object during the execution of a program. Instruction – a single operation performed by a CPU. Instruction set – the complete set of machine code instructions used by a CPU. Integrated development environment (IDE) – a suite of programs used to write and test a computer program written in a high-level programming language.
Integration testing – a method of testing a program that tests combinations of program modules that work together. Intellectual property rights – rules governing an individual’s ownership of their own creations or ideas, prohibiting the copying of, for example, software without the owner’s permission. Internet – massive network of networks, made up of computers and other electronic devices; uses TCP/IP communication protocols. Internet protocol (IP) – uses IPv4 or IPv6 to give addresses to devices connected to the internet. Internet service provider (ISP) – company which allows a user to connect to the internet; they will usually charge a monthly fee for the service they provide. Interpreter – a computer program that analyses and executes a program written in a high-level language line by line.
Interrupt – signal sent from a device or software to a processor requesting its attention; the processor suspends all operations until the interrupt has been serviced. Interrupt dispatch table (IDT) – data structure used to implement an interrupt vector table. Interrupt priority – all interrupts are given a priority so that the processor knows which need to be serviced first and which interrupts are to be dealt with quickly. Interrupt priority levels (IPL) – values given to interrupts based on values 0 to 31. Interrupt service routine (ISR) or interrupt handler – software which handles interrupt requests (such as ‘printer out of paper’) and sends the request to the CPU for processing. IPv4 – IP address format which uses 32 bits, such as 200.21.100.6. IPv6 – newer IP address format which uses 128 bits, such as A8F0:7FFF:F0F1:F000:3DD0:256A:22FF:AA00. Iterative model – a type of program development cycle in which a simple subset of the requirements is developed, then expanded or enhanced, with the development cycle being repeated until the full system has been developed. JavaScript – object-orientated (or scripting) programming language used mainly on the web; used to enhance HTML pages. JPEG – Joint Photographic Expert Group; a form of lossy file compression based on the inability of the eye to spot certain colour changes and hues. Karnaugh maps (K-maps) – a method used to simplify logic statements and logic circuits; uses Gray codes. Kernel – the core of an OS with control over process management, memory management, interrupt handling, device management and I/O operations. Key distribution problem – security issue inherent in symmetric encryption arising from the fact that, when sending the secret key to a recipient, there is the risk that the key can be intercepted by an eavesdropper/hacker. Labelled data – data where we know the target answer and the data object is fully recognised.
LAN – local area network (network covering a small area such as a single building). Latency – the lag in a system; for example, the time to find a track on a hard disk, which depends on the time taken for the disk to rotate around to its read-write head. Least recently used (LRU) page replacement – page replacement algorithm in which the page which has not been used for the longest time is replaced. Leech – a peer with negative feedback from swarm members. Left shift – bits are shifted to the left. Legal – relating to, or permissible by, law. Lexical analysis – the first stage in the process of compilation; removes unnecessary characters and tokenises the program. Library program – a program stored in a library for future use by other programmers. Library routine – a tested and ready-to-use routine available in the development system of a programming language that can be incorporated into a program. Linear search – a method of searching in which each element of an array is checked in order.
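The linear search entry above can be sketched in a few lines of Python. The function name `linear_search` and the convention of returning -1 when the value is not found are illustrative assumptions.

```python
def linear_search(items, target):
    """Check each element in order; return its index, or -1 if not found."""
    for index, item in enumerate(items):
        if item == target:
            return index  # stop at the first match
    return -1
```

Unlike a binary search, this method does not need the array to be sorted, but in the worst case it has to check every element.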
Memory organisation – function of memory management that determines how much memory is allocated to an application.
Logic circuit – formed from a combination of logic gates and designed to carry out a particular task; the output from a logic circuit will be 0 or 1.
Memory protection – function of memory management that ensures two competing applications cannot use same memory locations at the same time.
Logic error – an error in the logic of a program.
Mesh network topology – interlinked computers/devices, which use routing logic so data packets are sent from sending stations to receiving stations only by the shortest route.
Logic gates – electronic circuits which rely on ‘on/off’ logic; the most common ones are NOT, AND, OR, NAND, NOR and XOR. Logical memory – the address space that an OS perceives to be main storage. Logical schema – a data model for a specific database that is independent of the DBMS used to build that database. Logical shift – bits shifted out of the register are replaced with zeros. Lossless file compression – file compression method where the original file can be restored following decompression. Lossy file compression – file compression method where parts of the original file cannot be recovered during decompression; some of the original detail is lost. Low level scheduling – method by which a system assigns a processor to a task or process based on the priority level. Lower bound – the index of the first element in an array, usually 0 or 1. Low-level programming – programming instructions that use the computer’s basic instruction set. Lurker – user/client that downloads files but does not supply any new content to the community. Machine code – the programming language that the CPU uses. Machine learning – systems that learn without being programmed to learn. Maintenance – part of the program development lifecycle; the process of making sure that the program continues to work during use. Malicious hacking – hacking done with the sole intent of causing harm to a computer system or user (for example, deletion of files or use of private data to the hacker’s advantage). Malware – malicious software that seeks to damage or gain unauthorised access to a computer system. MAN – metropolitan area network (network which is larger than a LAN but smaller than a WAN; can cover several buildings in a single city, such as a university campus). Mantissa – the fractional part of a floating point number. Mask – a number that is used with the logical operators AND, OR or XOR to identify, remove or set a single bit or group of bits in an address or register. 
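The mask entry above can be made concrete with a few one-line Python helpers. The function names are hypothetical, and bit positions are counted from 0 at the least significant bit, which is an assumption of this sketch.

```python
def is_bit_set(value, position):
    """Identify a bit: AND with a mask that has a single 1 at `position`."""
    return (value & (1 << position)) != 0


def set_bit(value, position):
    """Set a bit to 1: OR with the mask."""
    return value | (1 << position)


def clear_bit(value, position):
    """Remove a bit (set it to 0): AND with the inverted mask."""
    return value & ~(1 << position)


def toggle_bit(value, position):
    """Flip a bit: XOR with the mask."""
    return value ^ (1 << position)
```

For instance, applying `set_bit` to `0b1010` at position 0 gives `0b1011`, and applying `clear_bit` to `0b1010` at position 3 gives `0b0010`.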
Massively parallel computers – the linking together of several computers effectively forming one machine with thousands of processors. Memory cache – high speed memory external to processor which stores data which the processor will need again. Memory dump – contents of a computer memory output to screen or printer.
Linked list – a list containing several items in which each item in the list points to the next item in the list.
Metadata – a set of data that describes and gives information about other data. Method – a programmed procedure that is defined as part of a class. MIMD – multiple instruction multiple data, computer architecture which uses many processors, each of which can use a separate data source. MIME – multi-purpose internet mail extension; a protocol that allows email attachments containing media files as well as text to be sent. MISD – multiple instruction single data, computer architecture which uses many processors but the same shared data source. Modem – modulator demodulator; device which converts digital data to analogue data (to be sent down a telephone wire); conversely it also converts analogue data to digital data (which a computer can process). Modulo-11 – method used to calculate a check digit based on modulus division by 11. Monitor – to automatically take readings from a device. Morality – an understanding of the difference between right and wrong, often founded in personal beliefs. MP3/MP4 files – file compression method used for music and multimedia files. Multitasking – function allowing a computer to process more than one task/process at a time. NIC – network interface card; these cards allow devices to connect to a network/internet (usually associated with a MAC address set at the factory). Node – device connected to a network (it can be a computer, storage device or peripheral device). Node or vertex – fundamental unit from which graphs are formed (nodes and vertices are the points where edges converge). Non-composite data type – a data type that does not reference any other data types. Non-preemptive – type of scheduling in which a process terminates or switches from a running state to a waiting state. Normal test data – test data that should be accepted by a program. Normalisation (database) – the process of organising data to be stored in a database into two or more tables and relationships between the tables, so that data redundancy is minimised.
Memory management – part of the operating system that controls the main memory.
Normalisation (floating-point) – a method to improve the precision of binary floating-point numbers; the mantissa of a positive number should begin 0.1 and the mantissa of a negative number should begin 1.0.
Memory optimisation – function of memory management that determines how memory is allocated and deallocated.
Object – an instance of a class that is self-contained and includes data and methods.
Object code – a computer program after translation into machine code. Object-oriented programming (OOP) – a programming methodology that uses self-contained objects, which contain programming statements (methods) and data, and which communicate with each other. Odd parity – binary number with an odd number of 1-bits. On demand (bit streaming) – system that allows users to stream video or music files from a central server as and when required without having to save the files on their own computer/tablet/phone.
Pages – fixed-size logical memory blocks. Paging – form of memory management which divides up physical memory and logical memory into fixed-size memory blocks. PAN – network that is centred around a person or their workspace. Parallel processing – operation which allows a process to be split up and for each part to be executed by a different processor at the same time. Parameter – a variable applied to a procedure or function that allows one to pass in a value for the procedure to use.
One’s complement – each binary digit in a number is reversed to allow both negative and positive numbers to be represented.
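As a rough illustration of the one's complement definition, the Python sketch below reverses every bit of a fixed-width pattern. The function name `ones_complement` and the 8-bit default width are assumptions made for this example only.

```python
def ones_complement(value: int, bits: int = 8) -> int:
    """Return the one's complement of value within a fixed number of bits."""
    mask = (1 << bits) - 1   # all 1s, e.g. 0b11111111 for 8 bits
    return value ^ mask      # XOR with all 1s reverses every bit


# 00000101 (+5) becomes 11111010, the one's complement pattern for -5
print(format(ones_complement(0b00000101), "08b"))  # prints 11111010
```

Applying the function twice returns the original pattern, which is one quick way to check the result.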
Parity bit – an extra bit found at the end of a byte that is set to 1 if the parity of the byte needs to change to agree with sender/receiver parity protocol.
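The rule in the parity bit definition can be sketched in Python as follows. The function name `parity_bit`, the use of a string of 0s and 1s for the data, and the `even` flag are all assumptions made for illustration.

```python
def parity_bit(data_bits: str, even: bool = True) -> str:
    """Return the parity bit ('0' or '1') for a string of 0s and 1s."""
    odd_count = data_bits.count("1") % 2 == 1
    # Even parity: the bit is 1 only when the count of 1-bits is currently odd.
    # Odd parity: the bit is 1 only when the count of 1-bits is currently even.
    needs_one = odd_count if even else not odd_count
    return "1" if needs_one else "0"


print(parity_bit("1101001"))              # four 1-bits, even parity: prints 0
print(parity_bit("1101001", even=False))  # four 1-bits, odd parity: prints 1
```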
Opcode – short for operation code, the part of a machine code instruction that identifies the action the CPU will perform.
Parity block – horizontal and vertical parity check on a block of data being transferred.
Open – file-processing operation; opens a file ready to be used in a program.
Parity byte – additional byte sent with transmitted data to enable vertical parity checking (as well as horizontal parity checking) to be carried out.
Open Source Initiative – organisation offering the same freedoms as the Free Software Foundation, but with more of a focus on the practical consequences of the four shared rules, such as more collaborative software development. Operand – the part of a machine code instruction that identifies the data to be used by the CPU. Operating system – software that provides an environment in which applications can run and provides an interface between hardware and human operators. Optical storage – CDs, DVDs and Blu-ray® discs that use laser light to read and write data. Optimal page replacement – page replacement algorithm that looks forward in time to see which frame to replace in the event of a page fault. Optimisation (compilation) – the fourth stage in the process of compilation; the creation of an efficient object program. Optimisation (memory management) – function of memory management deciding which processes should be in main memory and where they should be stored. Organic LED (OLED) – uses movement of electrons between cathode and anode to produce an on-screen image; generates its own light so no back lighting required. Overclocking – changing the clock speed of a system clock to a value higher than the factory/recommended setting. Overflow – the result of carrying out a calculation which produces a value too large for the computer’s allocated word size. Overloading – feature of object-oriented programming that allows a method to be defined more than once in a class, so it can be used in different situations. Packet – a message/data is split up into smaller groups of bits for transmission over a network. Packet switching – method of transmission where a message is broken into packets which can be sent along paths independently from each other. Page fault – occurs when a new page is referred but is not yet in memory. Page replacement – occurs when a requested page is not in memory and a free page cannot be used to satisfy allocation. 
Page table – table that maps logical addresses to physical addresses; it contains page number, flag status, frame address and time of entry.
Parity check – method used to check if data has been transferred correctly; uses even or odd parity. Pattern recognition – the identification of parts of a problem that are similar and could use the same solution. Peer – a client who is part of a peer-to-peer network/file sharing community. Peer-to-peer – network in which each node can share its files with all the other nodes; each node has its own data; there is no central server. Perceptual music shaping – method where sounds outside the normal range of hearing of humans, for example, are eliminated from the music file during compression. Perfective maintenance – the process of making improvements to the performance of a program. Pharming – redirecting a user to a fake website in order to illegally obtain personal data about the user. Phishing – legitimate-looking emails designed to trick a recipient into giving their personal data to the sender of the email. PHP – hypertext processor; an HTML-embedded scripting language used to write web pages. Physical memory – main/primary RAM memory. Pieces – splitting up of a file when using peer-to-peer file sharing. Pinching and rotating – actions by fingers on a touch screen to carry out tasks such as move, enlarge, reduce, and so on. Pipelining – allows several instructions to be processed simultaneously without having to wait for previous instructions to finish. Piracy – the practice of using or making illegal copies of, for example, software. Pixel – smallest picture element that makes up an image. Pixel density – number of pixels per square centimetre. Plagiarism – the act of taking another person’s work and claiming it as one’s own. Plaintext – the original text/document/message before it is put through an encryption algorithm. Pointer data type – a non-composite data type that uses the memory address of where the data is stored.
Public IP address – an IP address allocated by the user’s ISP to identify the location of their device on the internet.
POP – post office protocol.
Public key – encryption/decryption key known to all users.
Port – external connection to a computer which allows it to communicate with various peripheral devices; a number of different port technologies exist.
Public key infrastructure (PKI) – a set of protocols, standards and services that allow users to authenticate each other using digital certificates issued by a CA.
Positive feedback – the output from a process which influences the next input value to the process.
Public switched telephone network (PSTN) – network used by traditional telephones when making calls or when sending faxes.
Post-WIMP – interfaces that go beyond WIMP and use touch screen technology rather than a pointing device. Preemptive – type of scheduling in which a process switches from running state to ready state or from waiting state to ready state. Prettyprinting – the practice of displaying or printing well set out and formatted source code, making it easier to read and understand. Primary key – a unique identifier for a table; it is a special case of a candidate key.
Pull protocol – protocol used when receiving emails, in which the client periodically connects to a server, checks for and downloads new emails from a server and then closes the connection. Push protocol – protocol used when sending emails, in which the client opens the connection to the server and keeps the connection active all the time, then uploads new emails to the server. Quad core – a CPU containing four cores.
Privacy – the right to keep personal information and data secret, and for it to not be unwillingly accessed or shared through, for example, hacking.
Quantum – a fixed time slice allocated to a process.
Private IP address – an IP address reserved for internal network use behind a router.
Quantum key distribution (QKD) – protocol which uses quantum mechanics to securely send encryption keys over fibre optic networks.
Private key – encryption/decryption key which is known only to a single user/computer. Procedure – a set of statements that can be grouped together and easily called in a program whenever required, rather than repeating all of the statements each time. Process – a program that has started to be executed. Process control block (PCB) – data structure which contains all the data needed for a process to run. Process management – part of the operating system that involves allocation of resources and permits the sharing and exchange of data. Process states – running, ready and blocked; the states of a process requiring execution. Product key – security method used in software to protect against illegal copies or use. Program counter (PC) – a register used in a computer to store the address of the instruction which is currently being executed. Program development lifecycle – the process of developing a program set out in five stages: analysis, design, coding, testing and maintenance. Program library – a library on a computer where programs and routines are stored which can be freely accessed by other software developers for use in their own programs. Programmable ROM (PROM) – type of ROM chip that can be programmed once. Programming paradigm – a set of programming concepts. Property – data and methods within an object that perform a named action. Protocol – a set of rules governing communication across a network; the rules are agreed by both sender and recipient. Pseudocode – a method of showing the detailed logical steps in an algorithm, using keywords, identifiers with meaningful names, and mathematical operators.
Polymorphism – feature of object-oriented programming that allows methods to be redefined for derived classes.
Quantum cryptography – cryptography based on the laws of quantum mechanics (the properties of photons).
Quarantine – file or program identified as being infected by a virus which has been isolated by antivirus software before it is deleted at a later stage. Qubit – the basic unit of a quantum of information (quantum bit). Query processor – feature of a DBMS that processes and executes queries written in structured query language (SQL). Queue – a list containing several items operating on the first in, first out (FIFO) principle. Random access memory (RAM) – primary memory unit that can be written to and read from. Random file organisation – a method of file organisation in which records of data are physically stored in a file in any available position; the location of any record in the file is found by using a hashing algorithm on the key field of a record. Rapid application development (RAD) – a type of program development cycle in which different parts of the requirements are developed in parallel, using prototyping to provide early user involvement in testing. Read – file access mode in which data can be read from a file. Read-only memory (ROM) – primary memory unit that can only be read from. Real-time (bit streaming) – system in which an event is captured by camera (and microphone) connected to a computer and sent to a server where the data is encoded; the user can access the data ‘as it happens’ live. Record (database) – a row in a table in a database. Record (data type) – a composite data type comprising several related items that may be of different data types. Recursion – a process using a function or procedure that is defined in terms of itself and calls itself. Referential integrity – property of a database that does not contain any values of a foreign key that are not matched to the corresponding primary key.
Refreshed – requirement to charge a component to retain its electronic state.
Run-time error – an error found in a program when it is executed; the program may halt unexpectedly.
Register – temporary component in the processor which can be general or specific in its use; holds data or instructions as part of the fetch-execute cycle.
Sampling rate – number of sound samples taken per second.
Register Transfer Notation (RTN) – shorthand notation to show movement of data and instructions in a processor; can be used to represent the operation of the fetch-execute cycle.
Scheduling – process manager which handles the removal of running programs from the CPU and the selection of new processes.
Regression – statistical measure used to make predictions from data by finding learning relationships between the inputs and outputs. Reinforcement learning – system which is given no training; learns on basis of ‘reward and punishment’. Relational database – a database where the data items are linked by internal pointers. Relationship – situation in which one table in a database has a foreign key that refers to a primary key in another table in the database. Relative addressing – mode of addressing in which the memory address used is the current memory address added to the operand. Removable hard disk drive – portable hard disk drive that is external to the computer; it can be connected via a USB port when required; often used as a device to back up files and data. Repeater – device used to boost a signal on both wired and wireless networks. Repeating hubs – network devices which are a hybrid of hub and repeater unit. Report window – a separate window in the runtime environment of the IDE that shows the contents of variables during the execution of a program. Resistive – type of touch screen technology; when a finger touches the screen, the glass layer touches the plastic layer, completing the circuit and causing a current to flow at that point. Resolution – number of pixels per column and per row on a monitor or television screen. Reverse Polish notation (RPN) – a method of representing an arithmetical expression without the use of brackets or special punctuation. Reward and punishment – improvements to a model based on whether feedback is positive or negative; actions are optimised to receive an increase in positive feedback. Right shift – bits are shifted to the right. RISC – reduced instruction set computer.
Sampling resolution/bit depth – number of bits used to represent sound amplitude.
Screen resolution – number of horizontal and vertical pixels that make up a screen display; if the screen resolution is smaller than the image resolution, the whole image cannot be shown on the screen, or the original image will become lower quality. Second normal form (2NF) – the status of a relational database in which entities are in 1NF and any non-key attributes depend upon the primary key. Secondary key – a candidate key that is an alternative to the primary key. Secure Sockets Layer (SSL) – security protocol used when sending data over the internet. Security management – part of the operating system that ensures the integrity, confidentiality and availability of data. Seed – a peer that has downloaded a file (or pieces of a file) and has then made it available to other peers in the swarm. Segment (transport layer) – this is a unit of data (packet) associated with the transport layer protocols. Segment map table – table containing the segment number, segment size and corresponding memory location in physical memory; it maps logical memory segments to physical memory. Segment number – index number of a segment. Segments (memory) – variable-size memory blocks into which logical memory is split up. Semi-supervised (active) learning – system that interactively queries source data to reach the desired result; it uses both labelled and unlabelled data, but mainly unlabelled data on cost grounds. Sensor – input device that reads physical data from its surroundings. Sequential access – a method of file access in which records are searched one after another from the physical start of the file until the required record is found. Sequential circuit – circuit in which the output depends on input values produced from previous output values. Sequential file organisation – a method of file organisation in which records of data are physically stored in a file, one after another, in a given order.
Round robin (scheduling) – scheduling algorithm that uses time slices assigned to each process in a job queue.
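A minimal sketch of round robin scheduling, assuming each process is described only by a name and the total processor time it still needs. The function name `round_robin`, the process names and the time values are hypothetical, and no real operating system interface is being modelled.

```python
from collections import deque


def round_robin(burst_times: dict[str, int], quantum: int) -> list[str]:
    """Return the order in which processes are given a time slice."""
    queue = deque(burst_times.items())      # the job queue, first in first out
    order = []
    while queue:
        name, remaining = queue.popleft()   # next process gets the processor
        order.append(name)
        remaining -= quantum                # it runs for at most one quantum
        if remaining > 0:
            queue.append((name, remaining)) # unfinished: rejoin the back of the queue
    return order


print(round_robin({"P1": 5, "P2": 2, "P3": 4}, quantum=2))
# prints ['P1', 'P2', 'P3', 'P1', 'P3', 'P1']
```

Because every process rejoins the back of the queue after its time slice, no process can be starved of processor time.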
Serial access – a method of file access in which records are searched one after another from the physical start of the file until the required record is found.
Router – device which enables data packets to be routed between different networks (for example, can join LANs to form a WAN).
Serial file organisation – a method of file organisation in which records of data are physically stored in a file, one after another, in the order they were added to the file.
Routing table – a data table that contains the information necessary to forward a packet along the shortest or best route to allow it to reach its destination. Rules – relationships between facts.
Session caching – function in TLS that allows a previous computer session to be ‘remembered’, therefore preventing the need to establish a new link each time a new session is attempted.
Run length encoding (RLE) – a lossless file compression technique used to reduce text and photo files in particular.
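To show why run length encoding is lossless, the sketch below encodes a string as (count, character) pairs and then rebuilds the original exactly. The function names `rle_encode` and `rle_decode` and the pair format are assumptions for this example; real file formats store the runs differently.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[int, str]]:
    """Replace each run of identical characters with a (count, character) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]


def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Rebuild the original text by repeating each character count times."""
    return "".join(char * count for count, char in pairs)


encoded = rle_encode("aaaabbbcca")
print(encoded)              # prints [(4, 'a'), (3, 'b'), (2, 'c'), (1, 'a')]
print(rle_decode(encoded))  # prints aaaabbbcca
```

The technique only saves space when the data contains long runs; a string with no repeated neighbours would grow rather than shrink.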
Set – a given list of unordered elements that can use set theory operations such as intersection and union.
Shareware – software that is free of charge initially (free trial period); the full version of the software can only be downloaded once the full fee for the software has been paid. Shift – moving the bits stored in a register a given number of places within the register; there are different types of shift.
Structured query language (SQL) – the standard query language used with relational databases for data definition and data modification.
Sign and magnitude – binary number system where left-most bit is used to represent the sign (0 = + and 1 = –); the remaining bits represent the binary value.
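The sign and magnitude layout can be illustrated with a short Python sketch. The function name `sign_magnitude` and the 8-bit default width are assumptions chosen for the example.

```python
def sign_magnitude(value: int, bits: int = 8) -> str:
    """Return value as a sign-and-magnitude bit string of the given width."""
    magnitude_bits = bits - 1                     # one bit is reserved for the sign
    if abs(value) >= 1 << magnitude_bits:
        raise ValueError("value does not fit in the available bits")
    sign = "1" if value < 0 else "0"              # 0 = +, 1 = -
    return sign + format(abs(value), f"0{magnitude_bits}b")


print(sign_magnitude(5))   # prints 00000101
print(sign_magnitude(-5))  # prints 10000101
```

Only the left-most bit differs between +5 and -5, which is the defining feature of this representation.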
Stub testing – the use of dummy modules for testing purposes.
SIMD – single instruction multiple data, computer architecture which uses many processors and different data inputs.
Sum of products (SoP) – a Boolean expression containing AND and OR terms.
Single stepping – the practice of running a program one line/instruction at a time.
Super computer – a powerful mainframe computer.
SISD – single instruction single data, computer architecture which uses a single processor and one data source. SMTP – simple mail transfer protocol. Softmodem – abbreviation for software modem; a software-based modem that uses minimal hardware. Solid state drive (SSD) – storage media with no moving parts that relies on movement of electrons. Source code – a computer program before translation into machine code. Spread spectrum frequency hopping – a method of transmitting radio signals in which a device picks one of 79 channels at random. If the chosen channel is already in use, it randomly chooses another channel. It has a range up to 100 metres.
Sub-netting – practice of dividing networks into two or more sub-networks.
Supervised learning – system which is able to predict future outcomes based on past data; it requires both input and output values to be used in the training process. Swap space – space on HDD used in virtual memory, which saves process data. Swarm – connected peers (clients) that share a torrent/tracker. Switch – hardware used to connect together a number of devices to form a LAN; directs incoming data packets to a specific destination address only. Symbolic addressing – mode of addressing used in assembly language programming; a label is used instead of a value. Symmetric encryption – encryption in which the same secret key is used to encrypt and decrypt messages.
Spread spectrum technology – wideband radio frequency with a range of 30 to 50 metres.
Syntax analysis – the second stage in the process of compilation; output from the lexical analysis is checked for grammatical (syntax) errors.
SQL script – a list of SQL commands that perform a given task, often stored in a file for reuse.
Syntax diagram – a graphical method of defining and showing the grammatical rules of a programming language.
Stack – a list containing several items operating on the last in, first out (LIFO) principle.
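The last in, first out behaviour of a stack can be sketched with a small Python class built on a list. The class name `Stack` and its method names are assumptions for illustration, not a definitive implementation.

```python
class Stack:
    """A minimal last in, first out (LIFO) stack built on a Python list."""

    def __init__(self) -> None:
        self._items: list = []

    def push(self, item) -> None:
        """Add an item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the item most recently pushed."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self) -> bool:
        """Return True when the stack holds no items."""
        return not self._items


stack = Stack()
for value in (10, 20, 30):
    stack.push(value)
print(stack.pop())  # prints 30, the last item pushed
print(stack.pop())  # prints 20
```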
Syntax error – an error in the grammar of a source program.
Star network topology – a network that uses a central hub/switch with all devices connected to this central hub/switch; all data packets are directed through this central hub/switch. Starve – to constantly deprive a process of the necessary resources to carry out a task/process. State-transition diagram – a diagram showing the behaviour of a finite state machine (FSM). State-transition table – a table showing every state of a finite state machine (FSM), each possible input and the state after the input. Static RAM (SRAM) – type of RAM chip that uses flip-flops and does not need refreshing. Status register – used when an instruction requires some form of arithmetic or logical processing. Stepwise refinement – the practice of subdividing each part of a larger problem into a series of smaller parts, and so on, as required. Stream cipher – the encryption of bits in sequence as they arrive at the encryption algorithm. Structure chart – a modelling tool used to decompose a problem into a set of sub-tasks. It shows the hierarchy or structure of the different modules and how they connect and interact with each other. Structured English – a method of showing the logical steps in an algorithm, using an agreed subset of straightforward English words for commands and mathematical operations.
Setter – a method used to control changes to a variable.
System clock – produces timing signals on the control bus to ensure synchronisation takes place. Table – a group of similar data, in a database, with rows for each instance of an entity and columns for each attribute. TCP – transmission control protocol. Test plan – a detailed list showing all the stages of testing and every test that will be performed for a particular program. Test strategy – an overview of the testing required to meet the requirements specified for a particular program; it shows how and when the program is to be tested. Testing – part of the program development lifecycle; the testing of the program to make sure that it works under all conditions. Thick client – device which can work both off line and on line and is able to do some processing even if not connected to a network/internet. Thin client – device that needs access to the internet for it to work; it depends on a more powerful computer for processing. Third normal form (3NF) – the status of a relational database in which entities are in 2NF and all non-key attributes are independent. Thrash point – point at which the execution of a process comes to a halt because the system is busier paging in/out of memory than actually executing the process. Timeout – time allowed to elapse before an acknowledgement is received.
Touch screen – screen on which the touch of a finger or stylus allows selection or manipulation of a screen image; they usually use capacitive or resistive technology.
Vector graphics – images that use 2D points to describe lines and curves and their properties that are grouped to form geometric shapes.
Trace table – a table showing the process of dry-running a program with columns showing the values of each variable as it changes.
Verification – method used to ensure data is correct by using double entry or visual checks.
Tracker – central server that stores details of all other computers in the swarm. Translation lookaside buffer (TLB) – this is a memory cache which can reduce the time taken to access a user memory location; it is part of the memory management unit. Translator – the systems software used to translate a source program written in any language other than machine code. Transport Layer Security (TLS) – a more up-to-date version of SSL. Truth table – a method of checking the output from a logic circuit; they use all the possible binary input combinations depending on the number of inputs; for example, 2 inputs have 2² (4) possible binary combinations, 3 inputs will have 2³ (8) possible binary combinations, and so on. Tuple – one instance of an entity, which is represented by a row in a table. Twisted pair cable – type of cable in which two wires of a single circuit are twisted together; several twisted pairs make up a single cable. Two’s complement – each binary digit is reversed and 1 is added in right-most position to produce another method of representing positive and negative numbers. Underflow – the result of carrying out a calculation which produces a value too small for the computer’s allocated word size. Unicode – coding system which represents all the languages of the world (first 128 characters are the same as ASCII code). Unidirectional – used to describe a bus in which bits can travel in one direction only. Uniform resource locator (URL) – specifies location of a web page (for example, www.hoddereducation.co.uk). Universal Serial Bus (USB) – a type of port connecting devices to a computer. Unlabelled data – data where objects are undefined and need to be manually recognised. Unsupervised learning – system which is able to identify hidden patterns from input data; the system is not trained on the ‘right’ answer. Unwinding – process which occurs when a recursive function finds the base case and the function returns the values.
Video Graphics Array (VGA) – type of port connecting devices to a computer. Virtual machine – an emulation of an existing computer system; a computer OS running within another computer’s OS. Virtual memory – type of paging that gives the illusion of unlimited memory being available. Virtual memory systems – memory management (part of OS) that makes use of hardware and software to enable a computer to compensate for shortage of actual physical memory. Virtual reality headset – apparatus worn on the head that covers the eyes like a pair of goggles; it gives the user the ‘feeling of being there’ by immersing them totally in the virtual reality experience. Voice over Internet Protocol (VoIP) – converts voice and webcam images into digital packages to be sent over the internet. Von Neumann architecture – computer architecture which introduced the concept of the stored program in the 1940s. Walkthrough – a method of testing a program; a formal version of a dry run using pre-defined test cases. WAN – wide area network (network covering a very large geographical area). (W)AP – (wireless) access point which allows a device to access a LAN without a wired connection. Waterfall model – a linear sequential program development cycle, in which each stage is completed before the next is begun. Web browser – software that connects to DNS to locate IP addresses; interprets web pages sent to a user’s computer so that documents and multimedia can be read or watched/listened to. Web crawler – internet bot that systematically browses the world wide web to update its web page content. White-box testing – a method of testing a program that tests the structure and logic of every path through a program module. Wi-Fi – wireless connectivity that uses radio waves, microwaves. WIMP – windows, icons, menu and pointing device. Winding – process which occurs when a recursive function or procedure is called until the base case is found. WLAN – wireless LAN. 
WNIC – wireless network interface cards/controllers.
Upper bound – the index of the last element in an array.
Word – group of bits used by a computer to represent a single unit.
User account – an agreement that allows an individual to use a computer or network server, often requiring a user name and password.
World Wide Web (WWW) – collection of multimedia web pages stored on a website; uses the internet to access information from servers and other computers.
User defined data type – a data type based on an existing data type or other data types that have been defined by a programmer.
WPAN – wireless personal area network; a local wireless network which connects together devices in very close proximity (such as in a user’s house); typical devices would be a laptop, smartphone, tablet and printer.
Utility program – parts of the operating system which carry out certain functions, such as virus checking, defragmentation or hard disk formatting. Validation – method used to ensure entered data is reasonable and meets certain input criteria. Variable – a named value that can change during the execution of a program.
Write – file access mode in which data can be written to a file; any existing data stored in the file will be overwritten. Zero compression – way of reducing the length of an IPv6 address by replacing groups of zeroes by a double colon (::); this can only be applied once to an address to avoid ambiguity.
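Zero compression can be tried out with Python's standard `ipaddress` module, as in the sketch below. The helper name `compress_ipv6` and the sample address are assumptions for illustration. Note that the module's `compressed` form also drops leading zeros inside each group, which is a separate shortening step from the double colon.

```python
import ipaddress


def compress_ipv6(address: str) -> str:
    """Return the shortened text form of an IPv6 address."""
    return ipaddress.IPv6Address(address).compressed


full = "2001:0db8:0000:0000:0000:ff00:0042:8329"
print(compress_ipv6(full))  # prints 2001:db8::ff00:42:8329
```

The three consecutive groups of zeros collapse into a single double colon, and the shortened form still identifies the same address.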
Index
1D arrays 241–2 2D arrays 242–3 3D printers 79
A A* algorithm 425, 429–34, 541 abnormal test data 294, 298, 541 absolute addressing 121, 125, 541 abstract data types (ADTs) 238, 250–9, 464–89, 541 binary trees 451, 481–7, 542 graphs see graphs implementing one ADT from another ADT 488–9 linked lists 238, 250–1, 255–9, 464, 469–81, 547 queues 238, 250–1, 253–5, 464, 466–9, 549 stacks 238, 250–3, 464–6, 551 abstraction 217, 218–19, 541 acceptance testing 294, 299, 541 access rights databases 208, 210, 541 data security 159, 161, 541 accumulator (ACC) 108, 109, 110, 541 accuracy 323–4 acknowledgement 170, 175, 541 actuators 84 adaptive maintenance 294, 299, 541 addition 3–5 address bus 108, 112, 541 addressing modes 121, 125–6, 541 Advanced Research Project Agency Network (ARPAnet) 28, 29, 541 advertising 193 aggregation (containment) 499, 514–15, 543 Airbus A380 incompatible software issue 184 algorithms 219–35, 450–90, 541 abstract data types see abstract data types (ADTs) comparing 489–90 insertion and bubble sorting methods 458–64 linear and binary searching methods 451–7 page replacement 373, 388–9, 548 shortest path algorithms 425–34 writing 220–35 alpha testing 294, 299, 541 Amazon 33–4 analogue to digital converter (ADC) 19, 69, 81, 84, 541 analysis 283, 284, 285, 286, 541 AND gates 89, 91, 100 multi-input 101–2 anti-lock braking systems (ABS) 87
anti-spy software 160, 163, 541 antivirus software 137, 144, 163, 541 append file access mode 525, 531, 533, 541 application layer 329, 330–3 approximations 320–2 arguments 275, 280, 541 arithmetic-logic unit (ALU) 108, 109–10, 541 arithmetic operation instructions 124 arithmetic shift 130, 541 ARPAnet 28, 29, 541 arrays 238, 241–8, 541 artificial intelligence (AI) 189–93, 425–49, 541 impacts on society, the economy and the environment 190–3 machine learning, deep learning and 434–45 shortest path algorithms 425–34 artificial neural networks 435, 439–41, 444, 541 ASCII code 2, 12–14, 541 assemblers 121, 122–3, 150, 151, 541 assembly language 121–9, 541 instructions 123–5 simple programs 126–8 stages of assembly 122–3 Association for Computing Machinery (ACM) 179, 181, 541 asymmetric encryption 410, 413–14, 541 asynchronous serial data transmission 108, 114, 541 attributes classes 498, 501, 541 databases 197, 200, 541 audio compression 21, 541 authentication 159, 160, 541 authenticity 411 auto-documenter 157 automatic repeat request (ARQ) 170, 175, 541
B backing up data 167 back propagation 435, 441, 444–5, 541 back-up utility software 137, 146, 541 Backus-Naur form (BNF) notation 394, 397, 398, 400, 541 bad sectors 137, 143, 541 base case 490–1, 541 basic input/output system (BIOS) 108, 113, 542 Belady’s anomaly 373, 388, 541 beta testing 294, 299, 541 bidirectional buses 108, 112, 541 Big O notation 451, 489–90, 542
binary-coded decimal (BCD) system 2, 10–12, 542 binary files 329, 332, 542 binary floating-point numbers 313–25, 542 converting denary numbers into 317–25 converting into denary 314–17 binary number system 2–8, 542 converting between denary and 2–3 converting between hexadecimal and 8–9 binary search 451, 454–7, 542 binary shifts 130–1 binary trees 451, 481–7, 542 finding items 482–3 inserting items 484–7 writing programs for 517–21 binder 3D printing 69, 79, 542 biometrics 160, 163–4, 542 BIOS 108, 113, 542 birefringence 69, 76, 542 bit depth (sampling resolution) 15, 16, 20, 24, 542, 550 bit manipulation 130–2 bit-map images 15–18, 542 calculating file size 17–18 compared with vector graphics 18–19 file compression 22 bit rate 21, 22, 29, 52, 542 bits 2, 542 bit streaming 29, 52–3, 542 BitTorrent protocol 329, 335–7, 542 black and white images 23 black-box testing 294, 299, 542 block chaining 410, 411, 542 block cipher 410, 411, 542 blocked state 377–8 Bluetooth 28, 41–2, 542 protocols 335 Blu-ray® discs 76 Boolean algebra 89, 90–2, 95, 354–6, 542 and logic circuits 361–8 simplification using 355–6 bootstrap program 373, 374, 542 bots 165 boundary test data 294, 298, 542 break 273 breakpoint 150, 155, 542 bridges 28, 47, 542 British Computer Society (BCS) 179, 180, 541 broadcast 29, 50–1, 542 bubble sort 238, 245–8, 458–60, 542 buffering 29, 52, 542 bugs 294–6 burst time 373, 376, 542 buses 109, 112–14
bus network topology 28, 37, 39, 542 by reference method 275, 277–8, 542 bytecode 152–3 bytes 6–7 by value method 275, 277, 542
C cache memory 68, 69, 71, 108, 113, 542, 547 Cambridge Analytica scandal 193 candidate keys 197, 200–1, 542 capacitive touch screens 69, 82–3, 542 carrier sense multiple access with collision avoidance (CSMA/CA) 335 carrier sense multiple access with collision detection (CSMA/CD) 29, 51, 543 car sensors 86–7 CASE statements 222, 223, 271–3 CDs 75–6 cellular networks 56 central processing unit (CPU) architecture 107–20 components 109–10 computer ports 108, 114–16, 549 fetch-execute cycle 108, 116–18, 544 interrupts 108, 118–19 registers 108, 109, 110–11, 550 system buses 109, 112–14 certificate authority (CA) 416, 418, 420, 421, 542 character set 2, 12, 542 chatbots 435, 442–3, 542 check digits 169, 171, 542 checksums 169, 172, 340, 542 ciphertext 410, 411, 542 circuit switching 55, 337, 338, 542 comparison with packet switching 340–1 CISC (complex instruction set computer) 347, 348, 542 classes 307, 498, 501, 542 classless inter-domain routing (CIDR) 54, 58, 59, 542 CLI (command line interface) 137, 138–9, 542 client/server network model 28, 32–4, 35–6, 542 clock cycle 108, 113, 542 clock page replacement 388–9 close (file processing) 525, 542 cloud software 41 cloud storage 28, 39–41, 543 clusters 347, 352, 543 CMOS 137, 138, 543 coaxial cables 28, 44, 543 code generation 394, 395, 397, 543 codes of conduct 180–3 coding 283, 284, 285, 286, 543 collisions 29, 50–1, 543 colour depth 15, 16, 24, 543 coloured images 24 colouring monochrome photos 442
combination circuits 354, 358, 543 command line interface (CLI) 137, 138–9, 542 commercial software 187 communication 27–67, 328–45 circuit switching and packet switching 337–43 internet see internet networking 28–53 protocols 328–37 compare instructions 125 compilation, stages in 395–8 compilers 149, 151–2, 155, 394–5, 543 composite data types 238, 240, 306–7, 543 composite key 197, 543 computational thinking skills 450–97 algorithms see algorithms recursion 490–4 skills 217–19 computer-assisted translation (CAT) 441 computer ethics 180–1 conditional instructions 125 conditional loops 456 confidentiality 411 conflicts 29, 50, 543 constants 264, 265–71, 543 constructors 499, 515–16, 543 containment (aggregation) 499, 514–15, 543 context switching 373, 379, 381, 543 contiguous 137, 140, 543 single (contiguous) memory allocation 383 control 85–7, 130, 131–2, 543 control bus 108, 112, 543 control unit 108, 109, 110, 543 copyright issues 186–9 cores 108, 113, 543 corrective maintenance 294, 299, 543 count-controlled loops 274, 275 criminal justice system 192 cross-coupling 354, 358–9, 543 CSMA/CA (carrier sense multiple access with collision avoidance) 335 CSMA/CD (carrier sense multiple access with collision detection) 29, 51, 543 culture 179, 543 current instruction register (CIR) 108, 110, 116, 117, 543 cyclic shift 130, 543
D database management systems (DBMSs) 208–10, 543 databases 196–208, 543 normalisation 203–7 data bus 108, 112, 543 data definition language (DDL) 211–12, 543 data dependency 209 data dictionary 208, 209, 543
data entry, verification during 170–2 datagrams 330 data hiding 498, 503, 543 data inconsistency 209 data integrity 169–76, 411, 543 data input instructions 124 data-link layer 329, 330, 334–7 data loss in cloud storage 40–1 preventing 160–4 data management 208, 209, 543 data manipulation language (DML) 211, 213–14, 543 data modelling 208, 210, 543 data movement instructions 123–4 data output instructions 124 data privacy 159, 160, 543 data protection laws 159, 160, 543 data recovery 167 data redundancy 28, 40, 209, 543 data representation 2–15, 304–27 ASCII code 2, 12–14, 541 file organisation and access 308–11 floating-point numbers 312–25 number systems 2–12 Unicode 2, 14, 15, 552 user-defined data types 304–7, 552 data security 40, 159–68, 410–24, 543 digital signatures and digital certificates 418–23 encryption 160, 163, 410–14, 544 protocols 416–18 quantum cryptography 414–15, 549 when using cloud storage 40–1 data transfer, verification during 172–5 data types 238–41, 543 abstract see abstract data types (ADTs) composite 238, 240, 306–7, 543 non-composite 305–6, 547 debugging 150, 155–6, 543 declarative programming 499, 521–4, 543 decomposition 217, 219, 330, 543 deep learning 434, 435, 439–43, 543 default 273 defragmentation software 144–5 De Morgan’s Laws 355 denary numbers 2, 7–8 converting between binary numbers and 2–3 converting binary floating-point numbers into 314–17 converting into binary floating-point numbers 317–25 design 283, 284, 285, 286, 543 destructors 499, 515, 517, 543 developer interface 209, 210, 543 device driver 137, 543 dictionaries 451, 488–9, 544 digest 418, 419, 420, 544 digital certificates 418, 420–2, 544 digital rights management (DRM) 186, 187, 544 digital signatures 162, 418, 419–20, 544
digital to analogue converter (DAC) 69, 80–1, 84, 544 Dijkstra’s algorithm 425–9, 544 direct 3D printing 69, 79, 544 direct access 308, 310, 311, 544 direct addressing 121, 125, 544 direct memory access (DMA) controller 373, 375, 544 dirty pages 373, 383, 544 disk compression 137, 145, 544 disk content analysis software 137, 145, 544 disk defragmenter 137, 145, 544 disk formatter 137, 143, 544 disk thrashing 373, 386, 544 DNS cache poisoning 160, 166, 544 DO ... ENDWHILE loops 274–5 domain name system/service (DNS) 54, 61–2, 330, 544 double entry 171 dry runs 294, 296–8, 544 dual core 108, 113, 544 dual layering 69, 75–6, 544 DV (digital video) cameras 20, 21 DVDs 75–6 dynamic link files (DLL) 138, 148, 544 dynamic RAM (DRAM) 68, 70–1, 544
E eavesdropper 410, 411, 544 electromagnetic radiation 42–3 electronically erasable programmable read-only memory (EEPROM) 69, 74, 544 Else 273 embedded systems 72 emulation 392, 544 encapsulation 498, 501, 544 encryption 160, 163, 410–14, 544 entities 197, 200, 544 entity-relationship (E-R) diagrams 197, 202, 544 enumerated data types 305, 544 erasable PROM (EPROM) 69, 72, 544 errors in programs 294–6 Ethernet 29, 50–1, 544 protocols 334–5 ethical hacking 160, 164, 544 ethics 179–85, 544 even parity 169, 173, 544 exception handling 525, 535–7, 544 exceptions 525, 535–6, 544 exploding laptop computers 184 exponent 313–24, 544 extreme test data 294, 298, 544
F face recognition software 439–40 facts 499, 521–3, 544 false positives 137, 544 faults in programs 294–6 Federation Against Software Theft (FAST) 186
fetch-execute cycle 108, 116–18, 544 fibre optic cables 28, 44, 544 fields 197, 199–200, 544 file access 308, 309–11, 544 file-based approach 197–9 how a DBMS addresses limitations of 209–10 file compression 21–5, 145 file organisation 308–9, 544 file processing 525–35 adding records 531–5 finding records 535 storing records 526–31 files 238, 249–50, 544 bit-map image file sizes 17–18 management of 142 file server 28, 30, 545 file transfer protocol (FTP) 329, 330, 331–2, 545 fingerprint scans 164 finite state machine (FSM) 287, 292, 545 firewalls 48, 160, 162–3, 545 first come first served (FCFS) scheduling 379, 381 first in first out (FIFO) page replacement 373, 388, 545 first normal form (1NF) 197, 203, 204–5, 545 flags 108, 111, 545 flash memory 69, 74–5, 374, 545 flip-flop circuits 354, 358–61, 545 floating-point numbers 312–25 flooding 38 flowcharts 219, 220, 221, 545 writing pseudocode from 229–31 foreign keys 197, 200, 201, 545 FOR loops 225, 226 FOR ... NEXT loops 274, 275 fragmentation 69, 73, 545 frame rate 15, 21, 24, 545 frames memory blocks 373, 383–4, 545 packets 330 Free Software Foundation 186, 187–8, 545 freeware 186, 189, 545 FTP (file transfer protocol) 329, 330, 331–2, 545 full adder circuits 354, 357–8, 545 functions 264, 269, 545 string manipulation functions 269–71 structural programming 278–80
G gateways 28, 45, 48, 49, 545 general AI 435 general case 490–1, 545 getters 499, 515, 516, 545 graphical user interface (GUI) 137, 138, 139, 545 graphs 451, 487, 545 shortest path algorithms 425–34
gray codes 354, 363, 364, 545 guest operating system (OS) 392, 393, 545 GUI (graphical user interface) 137, 138, 139, 545
H hacking 160, 164, 179, 545 half adder circuits 354, 356–7, 545 handshake 416, 417, 418, 545 hard disk drives (HDDs) 69, 73–4, 545 hardware 30, 68–106, 346–71 Boolean algebra and logic circuits 354–68 computers and their components 68–89 logic gates and logic circuits 89–104 needed to support the internet 55–7 processors and parallel processing 346–53 requirements of networks 45–50 hardware management 137, 142, 545 hashing algorithms cryptography 418, 419, 420, 545 file access 308, 309, 310–11, 545 HCI (human-computer interface) 137, 138, 545 headers data packets 337, 340–1, 342, 545 procedures or functions 275, 280, 545 heuristic checking 137, 144, 545 heuristics 425, 430, 545 hexadecimal number system 2, 7–10, 545 high-bandwidth digital copy protection (HDCP) 108, 115, 545 high-definition multimedia interface (HDMI) 108, 115, 116, 545 hop number/hopping 337, 340, 545 host 329, 333, 545 host operating system (OS) 392, 393, 545 host-to-host protocol 329, 333–4, 545 HTTP (hypertext transfer protocol) 329, 330–1, 545 hubs 28, 37, 45–6, 545 repeating 28, 46–7, 550 human-computer interface (HCI) 137, 138, 545 hybrid cloud 40 hybrid networks 28, 39, 545 HyperText Mark-up Language (HTML) 54, 55, 545 scripting in 62–4 hypervisor 392, 545
I icons 137, 545 identifier 238, 239, 545 identifier tables 221, 227, 230, 233, 244, 246, 265, 290 IEEE 50, 179, 180–1, 546 IF statements 222, 223, 224, 271, 457
image resolution 15, 16–17, 24, 546 images general file compression methods 24 run-length encoding with 23–4 IMAP (internet message access protocol) 329, 330, 332–3, 546 immediate access store (IAS) 108, 110, 546 immediate addressing 121, 126, 546 imperative programming 498, 500–1, 546 in demand paging 373, 385–6, 546 index array 238, 241, 546 database 197, 202, 546 indexed addressing 121, 125, 546 indirect addressing 121, 125, 546 infrared radiation 42–3 inheritance 499, 505–9, 514–15, 546 inkjet printers 78 input data instructions 124 input devices 81, 84–7 input/output (I/O) system 374–5 insertion sort 451, 461–4, 546 instances 498, 502–4, 546 Institute of Electrical and Electronics Engineers (IEEE) 50, 179, 180–1, 546 instructions 121–2, 546 assembly language instructions 123–5 instruction set 121, 122, 546 integrated development environments (IDEs) 150, 151, 153–7, 546 integration testing 294, 299, 546 integrity, data 169–76, 411, 543 intellectual property rights 179, 180, 546 copyright issues 186–9 internet 54–65, 187, 546 communication and internet technologies 328–45 hardware and software needed 55–7 IP addresses 57–61 TCP/IP protocols 57, 329–37 internet message access protocol (IMAP) 329, 330, 332–3, 546 internet/network layer 329, 330, 334–7 internet protocols (IPs) 54, 57–61, 334, 546 internet service providers (ISPs) 54, 55, 546 interpreters 149, 151–2, 155, 394–5, 546 interrupt dispatch table (IDT) 373, 382, 546 interrupt priority 108, 119, 546 interrupt priority levels (IPL) 373, 382, 546 interrupts 108, 118–19, 349–50, 382, 546 interrupt service routine (ISR) (interrupt handler) 108, 118, 119, 546 IPv4 addressing 54, 57–8, 546 IPv6 addressing 54, 58–9, 546 iterative model 283, 286, 546
J Java 228, 239, 271 binary search 456, 457 bubble sort 460 case statements 273 constants and variables 266, 267, 268, 269 exception handling 537 file processing 529–31, 533 functions 278, 280 IF statement 224 insertion sort 463 linear search 453–4 linked lists 473, 476, 480 loops 274, 275 OOP 503, 504, 508–9, 512, 514 procedures 275, 276, 277, 278 queues 466, 467, 468 recursion 492 stacks 464, 465, 466 writing programs for binary trees 518, 520, 521 JavaScript 54, 63, 546 JK flip-flops 360–1 JPEG 21, 22, 546
K Karnaugh maps (K-maps) 354, 363–7, 546 kernel 373, 375–6, 546 key distribution problem 410, 412, 546 keyword table 396
L LA airport shutdown 184 labelled data 434, 437–8, 440, 441, 546 language translation 149–57, 394–402 LANs (local area networks) 28, 29, 31, 32, 546 laser printers 77 latency 69, 73, 74, 351–2, 546 least recently used (LRU) page replacement 373, 388, 546 leeches 329, 336, 337, 546 left shift 130, 131, 546 legality 179, 546 lexical analysis 394, 395–7, 546 library programs 138, 147–8, 546 library routines 138, 147–8, 264, 271, 546 linear search 238, 243–4, 451–4, 546 linked lists 238, 250–1, 464, 469–81, 547 deleting items 477–81 finding items in 469–74 inserting items 474–7 linked list operations 255–9 local area networks (LANs) 28, 29, 31, 32, 546 logical memory 373, 383–4, 547 logical schema 208, 210, 547 logical shift 130, 547 logic bombs 165
logic circuits 89, 92–101, 356–68, 547 Boolean algebra and 361–8 flip-flop circuits 358–61 half adder and full adder circuits 356–8 in the real world 99–101 simplification 101 logic errors 150, 155, 295, 547 logic gates 89–92, 547 multi-input 101–4 loops 274–5, 456 writing algorithms 220–9 lossless file compression 21, 547 lossy file compression 21, 547 lower bound 238, 241–2, 547 low-level programming 498, 499–500, 547 low level scheduling 373, 377, 547 lurkers 329, 336, 547
M machine code 121–2, 547 machine learning 193, 434, 435, 436–9, 443, 547 maintenance 283, 284, 285, 286, 294, 299, 547 malicious hacking 160, 164, 547 malware 160, 162, 164–6, 547 MANs (metropolitan area networks) 28, 30, 32, 547 mantissa 313–24, 547 mask 130, 547 massively parallel computers 347, 352, 547 memory 69–77 measurement of size 6–7 memory cache 68, 69, 71, 108, 113, 542, 547 memory dumps 2, 9–10, 547 memory management 137, 140–1, 373, 382–5, 389, 547 memory optimisation 137, 140, 382, 547, 548 memory organisation 137, 140, 547 memory protection 137, 140–1, 547 memory sticks (flash memories) 69, 74–5, 374, 545 mesh network topology 28, 38, 547 metadata 329, 335, 547 methods 498, 501, 547 object methods 515–17 metropolitan area networks (MANs) 28, 30, 32, 547 microphones 81 microwave radiation 42–3 MIMD (multiple instruction multiple data) 347, 351, 352, 547 MIME (multi-purpose internet mail extension) protocol 329, 332, 547 MISD (multiple instruction single data) 347, 351, 547 modems 28, 48–9, 547 modulo-11 169, 171, 547
monitoring 85–7, 130, 131–2, 547 morality 179, 547 motion JPEG 20 movie files 20–1, 24 MPEG-3 (MP3) files 21–2, 547 MPEG-4 (MP4) files 21, 22, 547 multi-input logic gates 101–4 multimedia 15–21 multitasking 373, 376, 547
N NAND gates 89, 91, 100 narrow AI 435 negative numbers 3–4 converting binary floating-point numbers into denary 315–17 converting denary numbers into binary floating-point numbers 319–20, 323 normalisation 322 network/data-link layer 329, 330, 334–7 networking 28–53 bit streaming 29, 52–3, 542 client/server model 28, 32–4, 35–6, 542 devices 29–32 Ethernet 29, 50–1, 544 hardware requirements 45–50 peer-to-peer model 28, 34–5, 548 public and private cloud computing 39–41 topologies 36–9 wired and wireless 41–5 network interface cards (NICs) 28, 49, 547 wireless 29, 50, 552 nodes networks 28, 34, 547 vertices (in graphs) 425–34, 547 non-composite data types 305–6, 547 non-preemptive scheduling 373, 376, 547 non-repudiation 411 NOR gates 89, 91 normalisation databases 197, 203–7, 547 floating-point numbers 313, 322, 547 normal test data 294, 298, 547 NOT gates 89, 90, 100 number systems 2–12 BCD 2, 10–12, 542 binary 2–8, 8–9, 542 hexadecimal 2, 7–10, 545
O object code 121, 123, 548 object-oriented programming (OOP) 498, 501–21, 548 containment 499, 514–15, 543 inheritance 499, 505–9, 514–15, 546 object methods 515–17 polymorphism and overloading 509–14 writing a program for a binary tree 517–21 objects 307, 498, 502–4, 547
odd parity 169, 173, 548 on demand (bit streaming) 29, 53, 548 one’s complement 2, 3, 548 opcode 121, 122, 548 open (file processing) 525, 533, 548 Open Source Initiative 186, 187–9, 548 operand 121, 122, 548 operating systems (OS) 136–49, 372–92, 548 memory management 137, 140–1, 373, 382–5, 389, 547 need for 138–9 page replacement 373, 388–9, 548 process management 137, 142, 376–7, 389, 549 process states 373, 377–82, 549 program libraries 138, 147–8, 549 resource maximisation 374–6 tasks 140–2 utility software 137, 143–6, 552 virtual memory 373, 385–7, 552 optical storage 69, 75–6, 548 optimal page replacement (OPR) 373, 388, 548 optimisation compilation 394, 395, 398, 548 memory management 137, 140, 382, 547, 548 organic light emitting diode (OLED) 69, 81–2, 548 OR gates 89, 91, 100 multi-input 102–3 OTHERWISE 271–3 output data instructions 124 output devices 77–84 overclocking 108, 113, 548 overflow errors 313, 325, 548 overloading 499, 509, 513–14, 548
P packets 28, 37, 329, 330, 548 packet switching 56, 337, 339–43, 548 compared with circuit switching 340–1 page fault 373, 388, 389, 548 page replacement 373, 388–9, 548 pages 373, 383–4, 548 page tables 373, 383–4, 548 paging 373, 383–4, 385, 548 using virtual memory 385–7 PANs (personal area networks) 28, 32, 548 parallel processing 347, 350–3, 548 parameters 275, 276–8, 280, 548 functions with and without 279 parity bit 169, 173, 548 parity blocks 169, 174, 548 parity byte 170, 174–5, 548 parity checks 169, 173–5, 329, 548 partial compiling and interpreting 152–3 passwords 161–2 pattern recognition 217, 219, 548
peers 329, 335, 548 peer-to-peer file sharing 335–7 peer-to-peer network model 28, 34–5, 548 perceptual music shaping 21, 22, 548 perfective maintenance 294, 299, 548 personal area networks (PANs) 28, 32, 548 pharming 160, 166, 548 phishing 160, 165–6, 548 phone calls 55–7 photographic (bit-map) images 22 photographs enhancing 442 turning monochrome photos into colour photos 442 PHP 54, 63–4, 548 physical memory 373, 383–4, 548 pieces 329, 335, 548 piezoelectric technology 78 pinching and rotating 137, 139, 548 pipelining 347, 348–50, 548 piracy 186, 548 pixel density 15, 17, 440, 548 pixels 15–16, 82, 548 plagiarism 179, 180, 548 plaintext 410, 411, 548 pointer data types 305, 306, 548 polymorphism 499, 509–13, 549 POP (or POP3/4) (post office protocol) 329, 330, 332–3, 549 ports 108, 114–16, 549 positive feedback 84, 354, 359, 549 positive numbers converting binary floating-point numbers into denary 314–15 converting denary numbers into binary floating-point numbers 317–19, 323, 324 normalisation 322 post-condition loops 274 post office protocol (POP or POP3/4) 329, 330, 332–3, 549 post-WIMP 137, 139, 549 precision 323–4 pre-condition loops 274–5 pre-emptive scheduling 373, 376, 549 prettyprinting 149, 154, 549 primary keys 197, 200–1, 549 primary memory 70–3 printers 77–9 privacy 179, 549 data privacy 159, 160, 543 software copyright and 186–7 private cloud 40 private IP addresses 54, 61, 549 private keys 410, 413–14, 549 private networks 31 procedures 264, 271, 275–8, 280, 549 process control block (PCB) 373, 377, 549 processes 373, 376, 549 process management 137, 142, 376–7, 389, 549
processors 107–35, 346–53 assembly language 121–9, 541 bit manipulation 130–2 CPU architecture 107–20 parallel processing 350–3 RISC and CISC processors 347–50 process priority 377 process states 373, 377–82, 549 product key 186, 549 professional ethical bodies 180–3 program counter (PC) 108, 116, 117, 549 program design 287–93 program development lifecycle 283–7, 549 different development lifecycles 285–7 purpose 284 stages 284–5 program libraries 138, 147–8, 549 programmable ROM (PROM) 69, 72, 549 program maintenance 283, 284, 285, 286, 294, 299, 547 program testing 283, 284, 285, 286, 293–4, 296–9, 551 programming 264–82, 498–540 basics 264–71 constants and variables 265–71 constructs 271–5 exception handling 525, 535–7, 544 file processing operations 525–35 library routines 264, 271 structured 275–80 programming paradigms 498–525, 549 declarative programming 499, 521–4, 543 imperative programming 498, 500–1, 546 low-level programming 498, 499–500, 547 OOP 498, 501–21, 548 properties 498, 502, 549 protocols 328–37, 549 security and 416–18 prototyping 287 pseudocode 219, 220, 221–33, 549 structure charts 289–91 writing algorithms using 221–9 writing from a flowchart 231–3 writing from a structured English description 229–31 public cloud 40 public impact of hardware or software 183–5 public IP addresses 54, 61, 549 public key infrastructure (PKI) 416, 418, 549 public keys 410, 413–14, 549 public networks 31 public switched telephone network (PSTN) 54, 55, 549 pull protocols 329, 332–3, 549 push protocols 329, 332, 549 Python 228, 239, 271, 273
binary search 456, 457 bubble sort 459 constants and variables 266, 267, 268 exception handling 536 file processing 527–8, 533 functions 278, 280 IF statement 224 insertion sort 463 linear search 452 linked lists 471, 476, 479 loops 274, 275 OOP 502, 503, 506, 510, 513 procedures 275, 276, 277, 278 queues 466, 467, 468 recursion 491 stacks 464, 465 writing programs for binary trees 518, 519, 520
Q quad core 108, 113, 549 quantum 373, 376, 379, 381, 549 quantum cryptography 414–15, 549 quantum key distribution (QKD) 414, 549 quarantine 137, 144, 549 qubit 414, 549 query processor 209, 210, 549 queues 238, 250–1, 464, 466–9, 549 queue operations 253–5
R radio waves 42–3 random access memory (RAM) 68, 70–1, 72, 374, 385, 549 random file organisation 308, 309, 310, 549 adding records to random files 533–5 finding records in random files 535 range 323–4 rapid application development (RAD) 283, 286–7, 549 read file access mode 525, 526–7, 549 read-only memory (ROM) 68, 70, 71–2, 549 ready state 377–8 real-time (bit streaming) 29, 53, 549 record protocol 417 records database 196, 199–200, 549 data type 238, 240–1, 549 recursion 490–4, 549 referential integrity 197, 201, 549 refreshed 68, 71, 550 registers 109, 110–11, 550 Register Transfer Notation (RTN) 108, 117–18, 550 regression 435, 445, 550 reinforcement learning 434, 439, 550 relational databases 196, 198–207, 550 relationships 197, 201–2, 550 relative addressing 121, 126, 550 removable hard disk drives 69, 74, 550
repeaters 28, 46–7, 550 repeating hubs 28, 46–7, 550 REPEAT ... UNTIL loops 225, 226, 227, 274 report window 150, 155–6, 550 resistive touch screens 69, 83, 550 resolution 15, 17, 550 resource management 374–6 retina scans 164 RETURN 279 Reverse Polish notation (RPN) 394, 400–1, 550 reward and punishment 434, 439, 550 right shift 130, 131, 550 RISC (reduced instruction set computer) 347–8, 550 robotics 190 ROM (read-only memory) 68, 70, 71–2, 549 rounding errors 320–2 round robin scheduling 373, 378–9, 381, 550 routers 28, 47–8, 49, 330, 550 routing 38 routing tables 337, 341–2, 550 rules 499, 521–4, 550 run-length encoding (RLE) 21, 22–4, 550 with images 23–4 with text data 23 running state 377–8 runtime environment with a debugger 155–6 run-time errors 294, 296, 550
S sampling rate 15, 20, 24, 550 sampling resolution/bit depth 15, 16, 20, 24, 542, 550 satellites 43, 56–7 scalable vector graphics (SVG) 22 scheduling 373, 374, 376–82, 550 routines 379–81 screen resolution 15, 16–17, 69, 82, 550 screens 82–4 secondary keys 197, 200, 550 secondary storage 70, 72–7 second normal form (2NF) 197, 203, 205, 550 Secure Sockets Layer (SSL) 416–17, 417–18, 550 digital certificate 421 security see data security security management 137, 141, 550 seeds 329, 336, 337, 550 segmentation 384–5 segment map table 373, 384, 550 segment numbers 373, 384, 550 segments memory 373, 384–5, 550 transport layer 329, 330, 550 semi-supervised (active) learning 434, 439, 550 sensors 69, 84–7, 550
sequential access 308, 309–10, 550 sequential circuits 354, 358, 550 sequential file organisation 308, 309–10, 550 adding records to sequential files 531–3 storing records in sequential files 526–31 serial access 550 serial file organisation 308, 309, 531, 550 storing records in serial files 526–31 services 30 session caching 416, 417, 550 sets 305, 307, 550 setters 499, 515, 516, 551 shareware 186, 189, 551 shifts 130–1, 551 shortest job first (SJF) scheduling 379–80, 381 shortest path algorithms 425–34 shortest remaining time first (SRTF) scheduling 379–80, 381 sign and magnitude 2, 3, 551 SIMD (single instruction multiple data) 347, 350, 352, 551 simple mail transfer protocol (SMTP) 329, 330, 332, 551 simplification of logic circuits 101 using Boolean algebra 355–6 single (contiguous) memory allocation 383 single pass assemblers 122 single stepping 150, 155, 551 SISD (single instruction single data) 347, 350, 551 SMTP (simple mail transfer protocol) 329, 330, 332, 551 softmodem 28, 49, 551 software 30, 136–58 cloud software 41 copyright and privacy 186–7 language translation 149–57, 394–402 licensing 187–9 needed to support the internet 55–7 operating systems see operating systems software development 283–303 program design 287–93 program development lifecycle 283–7 program testing and maintenance 293–300 Software Engineering Code of Ethics 181–3 solid state drives (SSDs) 69, 74–5, 551 sound files 19–20 source code 121, 122, 551 source code editor 154–5 space complexity 490 speakers 80–1 spread spectrum frequency hopping 28, 41–2, 551
spread spectrum technology 28, 31, 551 spyware 165 SQL scripts 211–14, 521, 551 SR flip-flops 358–60 stacks 238, 250–3, 464–6, 551 stack operations 251–3 star network topology 28, 37–8, 39, 551 starving a process 373, 376, 551 state-transition diagrams 287, 292–3, 551 state-transition tables 287, 292, 551 static libraries 148 static RAM (SRAM) 68, 70–1, 551 status register 108, 109, 110, 111, 551 stepwise refinement 219, 233–5, 551 storage devices 69–77 stream cipher 410, 411, 551 strings 269 manipulation functions 269–71 strong AI 435 structure charts 287, 288–92, 551 structured English 219, 220, 551 writing pseudocode from a structured English description 229–31 structured programming 275–80 structured query language (SQL) 209, 210, 211, 551 SQL scripts 211–14, 521, 551 stub testing 294, 299, 551 sub-netting 54, 59–61, 551 subtraction 5–6 sum of products (SoP) 354, 361, 551 super computers 347, 352, 551 supervised learning 434, 438, 551 swap space 373, 385, 551 swarm 329, 336, 551 switches 28, 37, 46, 551 symbolic addressing 121, 126, 551 symmetric encryption 410, 411–12, 551 syntax analysis 394, 395, 397, 551 syntax diagrams 394, 398–400, 551 syntax errors 150, 155, 295, 551 system buses 109, 112–14 system clock 108, 109, 110, 113, 551 system software 136–58 language translation 149–57, 394–402 operating systems see operating systems
T tables 196, 199–200, 551 TCP (transmission control protocol) 329, 333–4, 551 TCP/IP protocols 57, 329–37 terminology databases 441 test data 298 testing 283, 284, 285, 286, 293–4, 296–9, 551 test plans 294, 296, 298, 551 test strategy 294, 296, 551 text data, RLE on 23 text files 249–50 text mining 441
thermal bubble technology 78 thick clients 28, 35–6, 551 thin clients 28, 35–6, 551 third normal form (3NF) 197, 203–4, 206–7, 551 thrash point 373, 386, 551 time complexity 489 timeout 170, 175, 551 tokenisation 396 touch screens 69, 82–3, 552 trace tables 128, 294, 295, 297, 552 tracker 329, 336, 552 translation lookaside buffer (TLB) 373, 383, 552 translation memories 441 translation software 149–57, 394–402 translators 149, 150–1, 552 transmission control protocol (TCP) 329, 333–4, 551 TCP/IP protocols 57, 329–37 transport 192 transport layer 329, 330, 333–4 Transport Layer Security (TLS) 416, 417–18, 552 Trojan horses 165 truth tables 89, 90–8, 552 tuples 197, 200, 552 twisted pair cables 28, 44, 552 two pass assemblers 122–3 two’s complement 2, 3–4, 552
U unconditional instructions 125 underflow errors 313, 325, 552 Unicode 2, 14–15, 552 unidirectional buses 108, 112, 552 uniform resource locators (URLs) 54, 55, 61, 552 Universal Serial Bus (USB) ports 108, 114–15, 552 unlabelled data 434, 437, 440–1, 552 unsupervised learning 434, 438–9, 552 unwinding 490, 491, 552 upper bound 238, 241–2, 552 USB ports 108, 114–15, 552 use of data 193 user accounts 159, 160–1, 552 user-defined data types 304–7, 552 utility programs 137, 143–6, 552
V validation 169–70, 552 variables 264, 265–71, 552 VB 228, 239, 271 binary search 456, 457 bubble sort 459–60 case statements 273 constants and variables 266, 267, 268 exception handling 536–7 file processing 528–9, 533 functions 278, 280 IF statement 224 insertion sort 463
linear search 453 linked lists 472, 476, 479–80 loops 274, 275 OOP 502, 504, 506–7, 510–11, 513–14 procedures 275, 276, 277, 278 queues 466, 467, 468 recursion 492 stacks 464, 465 writing programs for binary trees 518, 519, 520 vector graphics 15, 18–19, 552 file compression 22 verification 169, 170–6, 552 during data entry 170–2 during data transfer 172–5 vertices (nodes) 425–34, 547 video 20–1, 24 Video Graphics Array (VGA) 108, 115–16, 552 virtual machines (VMs) 392–4, 552 virtual memory 373, 385–7, 552 virtual memory systems 137, 140, 552 virtual reality headsets 69, 83–4, 552 virus checkers 144 viruses 164–5
visual check 171 Voice over Internet Protocol (VoIP) 54, 55, 56, 552 Von Neumann architecture 108, 109, 552
W walkthrough 294, 298, 552 WANs (wide area networks) 28, 29–30, 32, 552 WAPs (wireless access points) 28, 31, 552 waterfall model 283, 285, 552 web browsers 54, 55, 61, 552 web crawler 435, 439, 552 WHILE ... DO ... ENDWHILE 274–5 white-box testing 294, 299, 552 wide area networks (WANs) 28, 29–30, 32, 552 Wi-Fi 28, 41–2, 552 WiMax (worldwide interoperability for microwave access) 335 WIMP (windows, icons, menu and pointing device) 137, 138, 552 winding 490, 491, 552
wired networking 43–5 vs wireless 44–5 wireless access points (WAPs) 28, 31, 552 wireless LANs (WLANs) 28, 31, 552 wireless networking 41–3, 44–5 wireless network interface cards/controllers (WNICs) 29, 50, 552 wireless personal area networks (WPANs) 28, 42, 552 wireless (Wi-Fi) protocols 335 word 108, 112, 552 World Wide Web (WWW) 54–5, 187, 552 worms 165 WPANs (wireless personal area networks) 28, 42, 552 write file access mode 525, 526–7, 552
X XOR gates 89, 92
Z zero compression 54, 58–9, 552 zero value 325