There has always been a blurred borderline between techniques for systems analysis and techniques for systems design. In this document, analysis techniques are those considered relevant to requirements elicitation and to modelling existing systems. The most appropriate techniques for modelling existing systems are Data Flow Modelling and Normalisation; however, we begin by discussing the nature of corporate information and the interview as a technique for requirements elicitation.
Prior to discussing the techniques available for identifying and recording information a brief description of the nature of organisational information will be useful.
Flow of Information
Information is the life blood of many organisations. Information flows into and out of organisations and between the different levels in an organisation. Typically at the lower levels of an organisation, information is very detailed, e.g. relating to individual customers, orders, suppliers, invoices etc. As information flows up the hierarchy of an organisation it tends to become summarised. As an example consider a banking environment. A teller is interested in the specific information about the account they are currently dealing with, such as account number and balance. At the end of each day the branch manager may receive a summary report showing the total of all balances of accounts at that branch, together with a short list of those individual customers with balances of less than -£500 or greater than £5,000 (an example of exception reporting). At the end of each week the area manager or director may receive a list of customers with balances of greater than £10,000. The flow of information in an organisation is described in more detail in Unit 3 (Organisations and Information technology). It is important in this context because much of the work of the systems analyst/designer is concerned with identifying the flow and structure of information within an organisation.
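The way detail is summarised as it moves up the hierarchy can be sketched in a few lines of Python. This is only an illustration of the banking example: the account records, customer names and function names are invented for the sketch, and only the thresholds (-£500, £5,000 and £10,000) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Account:
    number: str
    customer: str
    balance: float  # pounds; a negative balance means overdrawn


# Hypothetical teller-level detail for one branch (illustrative data only).
accounts = [
    Account("01234567", "A. Jones", 153.00),
    Account("01234568", "B. Smith", -620.50),
    Account("01234569", "C. Patel", 7200.00),
    Account("01234570", "D. Evans", 12500.00),
]


def branch_summary(accounts, low=-500, high=5000):
    """Daily branch report: total of all balances plus the exceptions."""
    total = sum(a.balance for a in accounts)
    exceptions = [a for a in accounts if a.balance < low or a.balance > high]
    return total, exceptions


def area_report(accounts, threshold=10000):
    """Weekly area report: only customers with balances above the threshold."""
    return [a for a in accounts if a.balance > threshold]


total, exceptions = branch_summary(accounts)
print(f"Branch total: £{total:,.2f}")
for a in exceptions:
    print(f"Exception: {a.customer} ({a.number}) £{a.balance:,.2f}")
for a in area_report(accounts):
    print(f"Area list: {a.customer} £{a.balance:,.2f}")
```

The teller works with individual `Account` records, the branch manager sees one total and a short exception list, and the area manager sees an even shorter list.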
Different Types and Levels of Data/Information
There is a distinction between data and information which can be simply described by the statement ‘information is data which has been processed such that it becomes meaningful’. In other words information is data which has been placed in a specific context. Consider the number 153. On its own this is meaningless data. When placed in a particular context, e.g. £153, the number becomes more meaningful because we now know that we are referring to 153 pounds. However this is still data rather than information. £153 only becomes information when you find out that it is the balance of account number 01234567.
Uses of Information
There are many uses of information :-
Standard Documents & Sources of Data/Information
There are many standard documents commonly in use in organisations. Some have been mentioned already, e.g. profit and loss statements, balance sheets, bills and payslips. Others include order forms, application forms, delivery notes, invoices, business letters etc. These can provide very useful information to the systems analyst. Other sources of data include existing computer systems and their documentation, the internet, and tables of data in magazines and newspapers.
Information Gathering Techniques / Interviews
The purpose of an interview is to identify how a person currently does their job (how the existing system works), the problems they face (what is wrong with the existing system) and how they would like to do their job (what is required of the new system).
A major factor in conducting interviews is the attitude of both the interviewee and the interviewer. Remember that you’re not there to impress the interviewee with your knowledge of computers, so don’t talk about megabytes, hard disk sizes and processor speeds.
Q. Who uses computers?
A. Mothers, Fathers, Aunts, Uncles, Brothers, Sisters etc.
Think of your friends and relations: what is their range of computing experience? The people you interview may be experts or complete novices, but they are likely to be apprehensive about the impact that a new computer system will have on their jobs (Will I still have a job? Will I have to learn new skills? Will I have to change the way I work?). As a systems analyst you have to build trusting relationships with these people in order to get the best information from them. An open, friendly, reassuring attitude is required.
Interview essentials :-
Types of Question :-
When conducting interviews there are different types of question which can be asked and it is important to know which kinds to use :-
Try to avoid using loaded or rhetorical questions since they serve no useful purpose. Loaded questions in particular may cause offence because they imply that you already know the answer and will stick to it regardless of what the person actually says. It is also important that you avoid answering your own questions: bite your tongue! A useful approach is to use open questions to get the big picture and progressively move towards closed questions as more detail is uncovered.
Listening Skills :-
So much for questioning, the next step is listening to the answers. You need to show the interviewee that you are listening to their answers and that you are interested by :-
The objectives of this section are to: provide definitions of the terms Data Flow Model and Data Flow Diagram; explain the components and representations which comprise Data Flow Diagrams; and introduce a step-by-step procedure for creating Data Flow Diagrams.
What is a Data Flow Model?
A Data Flow Model (DFM) defines the passage of data through a system. The DFM comprises a consistent set of Hierarchic Data Flow Diagrams (DFDs) and associated documentation.
Hierarchic Data Flow Diagrams
The word hierarchy implies that there are different levels of complexity. In terms of Data Flow Diagrams (DFD) there may be up to 4 levels. At the highest level (often called a context diagram and sometimes called a level 0 DFD) all the complexities of the internal workings of a system are hidden from view by representing the entire system as one black box process which receives input data flows and transmits output data flows. The next level down consists of a single level 1 DFD which provides an overview of the 6 - 10 processes which a typical system comprises. Each process in a level 1 DFD has its own Level 2 DFD in which the process is described in some detail. Some complex systems may require Level 3 DFDs for certain level 2 processes.
What do Data Flow Diagrams Consist of?
Data Flow Diagrams at all levels consist of 4 components, i.e. External Entities, Data Flows, Processes and Data Stores.
An external entity is a person, organisation, department, computer system or anything else which either sends data into a system (sometimes called a source) or receives data from a system (sometimes called a sink), but which for the purposes of the project in question is outside the scope of the system itself. External entities (in the SSADM scheme) are represented as ovals containing the name of the external entity and a unique alphabetic identifier.
| Tutor Guidance
A useful technique in this area, described in Goodland and Slater (1995), is the composition and decomposition of external entities, whereby an 'organisation' level external in a context diagram can be broken down into 'department' level externals at level 1 and subsequently 'user role' externals at level 2. This fairly simplistic approach may help learners to pitch their DFDs at the right level. |
A data flow is a route by which data may travel from one element of a DFD to another. Data flows are represented by arrows which are labelled with a simple meaningful name.
Processes are transformations which change incoming data flows into outgoing data flows. Processes are represented as rectangles which contain a simple description of the process. Each process has a unique reference number. In the early stages it is possible to show where in the organisation the process takes place; however, this is a physical constraint imposed by the existing system and should not appear in a completed ‘logical’ data flow diagram.
A data store is a repository for data. A data store is represented by an open-ended rectangle containing the name of the data store (usually a plural noun such as customers). Each data store has a unique reference number prefixed by the letter D.
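The four components and their identifier conventions can also be written down as a small data structure, which some learners find easier to check than a drawing. The following is a minimal sketch, not part of SSADM: the class names, the `identifier_problems` function and the library-style example elements are all hypothetical, and only the identifier rules (alphabetic for externals, numeric for processes, a number prefixed by D for data stores, all unique) are taken from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalEntity:
    ident: str        # unique alphabetic identifier, e.g. "a"
    name: str


@dataclass(frozen=True)
class Process:
    ref: int          # unique reference number
    description: str


@dataclass(frozen=True)
class DataStore:
    ref: str          # unique reference number prefixed by "D", e.g. "D1"
    name: str         # usually a plural noun


@dataclass(frozen=True)
class DataFlow:
    name: str         # simple meaningful label on the arrow
    source: object
    target: object


def duplicates(values):
    """Return the values that occur more than once, in first-seen order."""
    seen, repeated = set(), []
    for value in values:
        if value in seen and value not in repeated:
            repeated.append(value)
        seen.add(value)
    return repeated


def identifier_problems(externals, processes, stores):
    """List breaches of the identifier conventions described in the text."""
    problems = []
    for e in externals:
        if not e.ident.isalpha():
            problems.append(f"external {e.name!r}: identifier must be alphabetic")
    for s in stores:
        if not (s.ref.startswith("D") and s.ref[1:].isdigit()):
            problems.append(f"data store {s.name!r}: reference must be D plus a number")
    for label, ids in (("external", [e.ident for e in externals]),
                       ("process", [p.ref for p in processes]),
                       ("data store", [s.ref for s in stores])):
        for dup in duplicates(ids):
            problems.append(f"{label} identifier {dup!r} is not unique")
    return problems


# Hypothetical elements, loosely based on a library loans system.
borrower = ExternalEntity("a", "Borrower")
issue = Process(1, "Issue Loan")
loans = DataStore("D1", "Loans")
flows = [DataFlow("Loan Request", borrower, issue),
         DataFlow("Loan Details", issue, loans)]

problems = identifier_problems([borrower], [issue], [loans])
bad = identifier_problems([ExternalEntity("1", "Supplier")],
                          [Process(1, "Issue Loan"), Process(1, "Return Book")],
                          [DataStore("X2", "Orders")])
print(problems)
print(bad)
```

The first call reports no problems; the second deliberately breaks each convention once.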
How are DFDs Constructed?
Again there are no hard and fast rules, and many re-drafts will be necessary as the analyst’s understanding improves, new requirements are identified and the DFM is validated against the LDS. The following steps may be useful:-
| Tutor Guidance
In step 2 it may be useful to uniquely number each data flow and consider which flows can logically be dealt with by which processes. |
This is a top-down approach to Data Flow Modelling; alternatively you can work bottom-up by identifying low-level processes and grouping them. The approach is not important and often a hybrid approach is taken. What is important is that at the end of the exercise you have a consistent set of DFDs.
Tutorial Sheet: Data Flow Diagrams
Objectives
To model the LRC system using Data Flow Diagrams.
Scenario
The Learning Resources Centre at the University of Glamorgan (LRC) services the requirements of students, staff and external members who are allowed to reserve books and take books out on loan.
The following activities are currently supported by manual procedures;
Task 1
Establish the major inputs and outputs of the system, their sources and recipients and represent them using a context diagram.
Task 2
Uniquely identify each data flow with a number and create a table as follows: The verbs used are not intended to be prescriptive, rather to indicate the kind of verbs to be used. Woolly phrases, such as 'process' show that the level of understanding is insufficient.
| Process Number | Process Name (Concise, Descriptive) | Data Flows Supported |
| 1. | Create | 1,2 |
| 2. | Validate | 4 |
| 3. | Update | 7,9,10 |
| 4. | Collate | 3,5 |
| 5. | Delete | 6,8,11 |
| 6. | etc. | 12 |
NB At this stage try to use no more than 6 processes. (My guidelines suggest 6 - 10; however, it is far more likely that a process will have been omitted in the early stages and subsequently have to be included than vice versa.)
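Once the table has been drawn up it is worth checking that no numbered data flow has been forgotten. A minimal sketch of such a check is shown below, using the process numbers and flows from the example table. Two things are assumptions of the sketch rather than rules of the technique: that the flows are numbered 1 to 12, and that a flow listed against more than one process deserves a second look.

```python
# Process number -> (process name, data flows supported), from the example table.
processes = {
    1: ("Create", [1, 2]),
    2: ("Validate", [4]),
    3: ("Update", [7, 9, 10]),
    4: ("Collate", [3, 5]),
    5: ("Delete", [6, 8, 11]),
    6: ("etc.", [12]),
}


def coverage(processes, all_flows):
    """Return (flows no process handles, flows listed against several processes)."""
    handled = {}
    for number, (_name, flows) in processes.items():
        for flow in flows:
            handled.setdefault(flow, []).append(number)
    unassigned = sorted(f for f in all_flows if f not in handled)
    shared = {f: nums for f, nums in sorted(handled.items()) if len(nums) > 1}
    return unassigned, shared


# Assumption: the context diagram produced flows numbered 1 to 12.
unassigned, shared = coverage(processes, range(1, 13))
print("Unassigned flows:", unassigned)
print("Flows listed more than once:", shared)
```

For the example table both results are empty, so every flow has exactly one process to deal with it.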
Task 3
Draw a first draft of the level 1 DFD, ensuring that the externals are at the extremities of the diagram.
Task 4
Identify the data stores which are required to link the input and output processes and re-draft the diagram.
Task 5
Select a level 1 process and draw a level 2 DFD for it (in a real situation you would obviously have to do this for every process in the level 1 DFD).
Task 7
Review the entire DFD set against the identified requirements and redraft if necessary. A basic quality assurance check is to ensure that each data store has at least one input/update data flow and at least one enquiry/report/transmission data flow.
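The basic quality assurance check on data stores is mechanical enough to automate. The sketch below is one possible way of doing so, assuming the diagram has been recorded as a simple list of flows. The `Flow` tuple, the `store_problems` function and the loan-related element names are hypothetical and are not a model answer for the LRC system.

```python
from collections import namedtuple

# Each flow is recorded as a label plus the elements it runs between.
Flow = namedtuple("Flow", "name source target")

# Hypothetical fragment of a level 1 DFD (illustrative names only).
flows = [
    Flow("Loan Request", "a Borrower", "1 Issue Loan"),
    Flow("Loan Details", "1 Issue Loan", "D1 Loans"),
    Flow("Overdue Loans", "D1 Loans", "2 Chase Overdue Loans"),
    Flow("New Book Details", "3 Catalogue Book", "D2 Books"),
]
stores = ["D1 Loans", "D2 Books"]


def store_problems(stores, flows):
    """Report data stores that are never written to or never read from."""
    problems = []
    for store in stores:
        if not any(f.target == store for f in flows):
            problems.append(f"{store}: no input/update flow")
        if not any(f.source == store for f in flows):
            problems.append(f"{store}: no enquiry/report flow")
    return problems


problems = store_problems(stores, flows)
print(problems)
```

Here the check flags `D2 Books`, which is updated but never read, suggesting that a process or flow has been missed.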
| Learner Guidance
If you want further information, the school of computing has developed a Computer Based Learning package supporting the Normalisation technique. This package will eventually be available through the web but for now, locate the G:\Library\CBL\Normal folder (using 'my computer' ) and double click on the Normmen3.app icon. |
Normalisation is defined briefly but accurately in the following statement;
‘The Key the Whole Key and Nothing but the Key (so help me Codd)’
Typically the literature on normalisation covers many levels of normalisation (9 is not uncommon), but this seems to me to be a race amongst academics to identify as many levels as possible. In 99 cases out of 100, 3 levels of normalisation are all that is required.
1st Normal Form; converting an un-normalised data structure such as a report or an order form into 1st Normal Form (1NF) is commonly referred to as removing repeating groups but also may involve removing complex groups such as the Address Group described in rule 2 (see chapter 5). The aim is to ensure that each item is atomic.
2nd Normal Form; Converting a 1NF data structure into 2nd Normal Form (2NF) involves looking at each non-primary key attribute and ensuring that it depends on the whole of the key and not just part of it.
3rd Normal Form; Converting a 2NF data structure into 3rd Normal Form (3NF) involves looking at the interrelationships between non key attributes to see if any non key attributes depend only on each other.
This is all best described by looking at an example. Consider the following table which has been built up by an order entry clerk;
| Cust# | Name | Ord# | Date | Part# | Desc | Qty | Price | Supp# | Name |
| 1 | Tim | 123 | 20/3 | 1 | AA | 2 | 1.99 | 23 | ABC |
| | | | | 2 | BB | 3 | 2.99 | 23 | ABC |
| | | | | 3 | CC | 4 | 3.99 | 24 | DEF |
| | | 456 | 21/3 | 4 | DD | 5 | 4.99 | 25 | GHI |
| | | | | 5 | EE | 6 | 5.99 | 26 | JKL |
| 2 | John | 789 | 21/3 | 4 | DD | 7 | 3.99 | 25 | GHI |
| | | | | 6 | FF | 8 | 6.99 | 27 | MNO |
This table structure could be implemented quite easily in COBOL or in a network DBMS as shown in chapter 5, with all the associated problems.
A common representation of this kind of table in text books is as follows;
CUSTOMERS(Customer_Number, Customer_Name, (Order_Number, Order_Date, (Part_Number, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name)))
The internal brackets are meant to represent repeating groups and the underline represents a primary key. This is called an un-normalised or 0NF data structure.
There are two approaches to converting this 0NF structure to 1NF. The first involves replicating the values in the table as follows;
| Cust# | Name | Ord# | Date | Part# | Desc | Qty | Price | Supp# | Name |
| 1 | Tim | 123 | 20/3 | 1 | AA | 2 | 1.99 | 23 | ABC |
| 1 | Tim | 123 | 20/3 | 2 | BB | 3 | 2.99 | 23 | ABC |
| 1 | Tim | 123 | 20/3 | 3 | CC | 4 | 3.99 | 24 | DEF |
| 1 | Tim | 456 | 21/3 | 4 | DD | 5 | 4.99 | 25 | GHI |
| 1 | Tim | 456 | 21/3 | 5 | EE | 6 | 5.99 | 26 | JKL |
| 2 | John | 789 | 21/3 | 4 | DD | 7 | 3.99 | 25 | GHI |
| 2 | John | 789 | 21/3 | 6 | FF | 8 | 6.99 | 27 | MNO |
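The replication shown in the table above can be sketched as a simple flattening of nested repeating groups. This is only an illustration: the dictionary layout and the `flatten` function are assumptions of the sketch, while the values are copied from the order entry clerk's table.

```python
# The 0NF structure: customers contain orders, and orders contain parts.
# Part tuples are (Part#, Desc, Qty, Price, Supp#, Supplier Name).
customers = [
    {"cust": 1, "name": "Tim", "orders": [
        {"ord": 123, "date": "20/3", "parts": [
            (1, "AA", 2, 1.99, 23, "ABC"),
            (2, "BB", 3, 2.99, 23, "ABC"),
            (3, "CC", 4, 3.99, 24, "DEF"),
        ]},
        {"ord": 456, "date": "21/3", "parts": [
            (4, "DD", 5, 4.99, 25, "GHI"),
            (5, "EE", 6, 5.99, 26, "JKL"),
        ]},
    ]},
    {"cust": 2, "name": "John", "orders": [
        {"ord": 789, "date": "21/3", "parts": [
            (4, "DD", 7, 3.99, 25, "GHI"),
            (6, "FF", 8, 6.99, 27, "MNO"),
        ]},
    ]},
]


def flatten(customers):
    """Repeat the customer and order values on every part line."""
    rows = []
    for c in customers:
        for o in c["orders"]:
            for part in o["parts"]:
                rows.append((c["cust"], c["name"], o["ord"], o["date"], *part))
    return rows


rows = flatten(customers)
for row in rows:
    print(row)

# Each flat row is identified by the combination of Cust#, Ord# and Part#.
keys = [(r[0], r[2], r[4]) for r in rows]
print(len(keys) == len(set(keys)))
```

Every value is now atomic, at the cost of repeating the customer and order details on each line.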
However this seems to be a clumsy approach and results in a three part key consisting of Cust#, Ord# and Part#. A simpler approach is to separate the repeating groups out into separate tables.
Step 1 remove the repeating group of orders
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date, (Part_Number, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name))
Step 2 remove the repeating group of parts
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date)
ORDER_PARTS(Part_Number, Order_Number*, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name)
The structure is now in 1NF since there are no repeating or complex group items (each item depends on the key). The next step is to convert the structure into 2NF, by examining each non primary key attribute to ensure that each depends on the whole of the key.
The CUSTOMERS and ORDERS tables each have a single column making up their primary key and are therefore by definition in 2NF. However, looking at the ORDER_PARTS table it can be seen that Part_Description, Part_Price, Supplier_Number and Supplier_Name depend only on Part_Number, i.e. their values are the same regardless of Order_Number. (Part_Quantity depends on the whole of the key since different quantities can appear on different orders.) To convert to 2NF a separate table is created for part descriptions, prices and supplier details.
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date)
ORDER_PARTS(Part_Number, Order_Number*, Part_Quantity)
PARTS(Part_Number, Part_Description, Part_Price, Supplier_Number, Supplier_Name)
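The test "does this attribute depend on the whole of the key or just part of it?" can be tried out against sample data. The sketch below is a hypothetical helper, not a standard routine; the rows are the order lines from the example table, restricted to the description and quantity columns. Sample data can only show that a dependency does not hold. Whether one really does hold is a question about the business, which the analyst has to answer.

```python
def depends_on(rows, determinant, attribute):
    """True if, in these rows, each determinant value has one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in determinant)
        # setdefault stores the first value seen and returns it thereafter.
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


columns = ("Order_Number", "Part_Number", "Part_Description", "Part_Quantity")
order_parts = [dict(zip(columns, values)) for values in [
    (123, 1, "AA", 2),
    (123, 2, "BB", 3),
    (123, 3, "CC", 4),
    (456, 4, "DD", 5),
    (456, 5, "EE", 6),
    (789, 4, "DD", 7),
    (789, 6, "FF", 8),
]]

# The description is fixed by the part number alone: a partial dependency.
print(depends_on(order_parts, ["Part_Number"], "Part_Description"))
# The quantity is not: part 4 appears with quantities 5 and 7.
print(depends_on(order_parts, ["Part_Number"], "Part_Quantity"))
# The quantity needs the whole key.
print(depends_on(order_parts, ["Part_Number", "Order_Number"], "Part_Quantity"))
```

The description therefore belongs in a table keyed on Part_Number, while the quantity stays in ORDER_PARTS.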
The structures are now in 2NF since every non-primary key attribute depends on the whole of the key. The next step is to convert the structure into 3NF by ensuring that each non-primary key attribute depends on nothing but the key.
The CUSTOMERS table is patently in 3NF because there is no other non-primary key attribute for Customer_Name to depend on. The ORDERS table is in 3NF because there is no dependency between Order_Date and Customer_Number (a customer can place different orders on different dates). The ORDER_PARTS table is in 3NF because the quantity ordered is dependent on both the order number and the part number. Looking however at the PARTS table, it can be seen that the Supplier_Name attribute depends on the Supplier_Number and has nothing to do with the part number. To convert the structure into 3NF a separate table is created containing supplier details.
CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number*, Order_Date)
ORDER_PARTS(Part_Number, Order_Number*, Part_Quantity)
PARTS(Part_Number, Supplier_Number*, Part_Description, Part_Price)
SUPPLIERS(Supplier_Number, Supplier_Name)
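To see that nothing has been lost, the 3NF structure can be built in a relational database and joined back together. The following is a minimal sketch using Python's built-in `sqlite3` module. It reads the asterisk as marking a foreign key (an assumption, since the notation is not defined here), the column types are guesses, and to keep it short it loads only Tim's two orders from the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
conn.executescript("""
CREATE TABLE customers (
    customer_number INTEGER PRIMARY KEY,
    customer_name   TEXT NOT NULL
);
CREATE TABLE suppliers (
    supplier_number INTEGER PRIMARY KEY,
    supplier_name   TEXT NOT NULL
);
CREATE TABLE orders (
    order_number    INTEGER PRIMARY KEY,
    customer_number INTEGER NOT NULL REFERENCES customers,
    order_date      TEXT NOT NULL
);
CREATE TABLE parts (
    part_number      INTEGER PRIMARY KEY,
    supplier_number  INTEGER NOT NULL REFERENCES suppliers,
    part_description TEXT NOT NULL,
    part_price       REAL NOT NULL
);
CREATE TABLE order_parts (
    part_number   INTEGER NOT NULL REFERENCES parts,
    order_number  INTEGER NOT NULL REFERENCES orders,
    part_quantity INTEGER NOT NULL,
    PRIMARY KEY (part_number, order_number)
);
""")

# Parent rows are loaded before the rows that refer to them.
conn.execute("INSERT INTO customers VALUES (1, 'Tim')")
conn.executemany("INSERT INTO suppliers VALUES (?, ?)",
                 [(23, "ABC"), (24, "DEF"), (25, "GHI"), (26, "JKL")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(123, 1, "20/3"), (456, 1, "21/3")])
conn.executemany("INSERT INTO parts VALUES (?, ?, ?, ?)",
                 [(1, 23, "AA", 1.99), (2, 23, "BB", 2.99), (3, 24, "CC", 3.99),
                  (4, 25, "DD", 4.99), (5, 26, "EE", 5.99)])
conn.executemany("INSERT INTO order_parts VALUES (?, ?, ?)",
                 [(1, 123, 2), (2, 123, 3), (3, 123, 4), (4, 456, 5), (5, 456, 6)])

# Joining the five tables rebuilds the original order lines.
rows = conn.execute("""
    SELECT c.customer_number, c.customer_name, o.order_number, o.order_date,
           p.part_number, p.part_description, op.part_quantity, p.part_price,
           s.supplier_number, s.supplier_name
    FROM order_parts op
    JOIN orders o    ON o.order_number = op.order_number
    JOIN customers c ON c.customer_number = o.customer_number
    JOIN parts p     ON p.part_number = op.part_number
    JOIN suppliers s ON s.supplier_number = p.supplier_number
    ORDER BY o.order_number, p.part_number
""").fetchall()
for row in rows:
    print(row)
```

Supplier ABC is now stored once in `suppliers` even though it supplies two parts, so a change of supplier name is made in one place only.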
| Tutor Guidance
The best way I have come across is to use a real document such as an order form, a printed report or a screen dump of a transaction, and to make a mistake when carrying out the normalisation process. |