5.3.3
Data entry - Validation and Verification
| Revision
points: |
Candidates should be able to
- describe verification methods: double entry and visual checks;
- describe a range of validation checks and their
suitability in certain circumstances: including range checks, invalid
character checks, member lists, check digits;
|
There are many different errors that can occur when inputting data into a
system. Many are simple transcription errors (errors that occur when data is
copied from a document into a computer). To try and reduce the amount
of input errors, a system designer can build in validation and
verification checks into the software that the data is entered into.
It is important to understand that these checks cannot
eliminate all mistakes, they just try and ensure that the data is as accurate as
is reasonably possible for the purpose it is being collected for.
VERIFICATION:
This
type of data entry check is used to try and ensure that the
data is ACTUALLY correct. Since a
computer cannot do this by itself it has to involve a human in some way.
Double entry
- the data is entered twice, possibly by
two different operators. The two entries are then compared against each
other and a warning given if they do not match.
Visual checks- this is basically proof
reading, the data that has been
entered into the computer is visually checked by a human, either on
screen or from a printout, to be sure that it matches the data source.
VALIDATION:
This type of data entry check only checks that the data is
VALID, in other words it is sensible. It
does NOT mean that the data is actually correct!
Range checks - these are used with numerical fields to limit
the range of numbers a user can enter. For example:
- Imagine if you were making a database for a user to store details of
yachts that are sold in a second-hand boatyard and you know that they never
sell anything longer than 20 metres and less than 13 metres.
- It would make sense to make a range-check validation rule for the 'LENGTH'
field so that if an entry was made that was outside this range then a warning
message would pop up and the entry would not be accepted.
- A typical range-check validation rule might be: BETWEEN 13 AND 20
- This could also be entered as: >=13 AND <=20
Format checks - these
only allow valid text or numbers. Many use input 'masks' such as 000LLL.
Such a mask would only allow 3 numbers followed by 3 letters.
| Common input mask codes |
| 0 |
a digit between 0 and 9 must be entered |
| # |
an entry is optional, but it must be a digit
between 0 and 9 |
| L |
a letter between a to z must be entered |
| ? |
an entry is optional, but it must be a letter
from a to z |
| A |
a letter or digit must be entered |
| a |
an entry is optional, but it must be a letter
or digit |
Member lists
- this method of validation is used to limit entries to those that are members
of a list of allowed entries. The choices may appear in a drop-down list to
reduce transcription errors or be in the form of a list of acceptable entries
that the input is checked against.
- For example, a member list for a Vehicle Make field in a
vehicle database might be Renault, Ford, Honda etc.
- A member list validation rule such as "Renault" OR "Ford" OR "Honda" would
limit entries to members of this list.
Check digits - this type of check is used
with numbers. An extra 'check digit' is calculated from the numbers to be
entered and added to the end. The numbers can then be checked at any stage
by re-calculating the check digit from the other numbers and seeing if it matches the one entered. One
example where a check digit is used is in the 10 digit ISBN number which uniquely identifies books. The last
number of the ISBN is actually the check digit for the other numbers, for example - the ISBN 0192761501.
- The Modulus-11 system is one example of a check digit system and it can
apparently detect 99% of input errors (there
is still a small chance that more than one error occurs and the resulting number
has the same check digit).
How the Modulus-11 system works:
-
Each number
input is 'weighted' by multiplying it by it's position +1
- The weightings are adding
together (the checksum) and this is divided by the prime number 11
- The remainder
is then subtracted from 11 to get the check digit
- The check digit is then added to the
end of the number.
- To check if the number entered is correct the calculation is repeated but this
time the check digit is included in the calculation.
- Because of this the remainder should now be 0 and it it is then there is a 99% chance the
other numbers were entered correctly.
- Note: if the check digit is a 10 then this is printed as an x to keep it
to a single character.
i.e. the number 019276150 would have a check digit of 1 as
shown below.
|
|
|
|
|
|
|
|
|
|
|
Check
digit |
|
|
|
Position |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
|
|
|
ISBN |
0 |
1 |
9 |
2 |
7 |
6 |
1 |
5 |
0 |
1 |
Checksum with check digit |
Checksum / 11 |
|
(Number) x (position) |
0 |
9 |
72 |
14 |
42 |
30 |
4 |
15 |
0 |
1 |
187 |
17 remainder 0 |
- The weighted checksum without the check digit is 186.
186 divided by 11 = 16
remainder 10. 11 minus the remainder therefore
gives a check digit of 1
- When the check digit is included in the calculation in position 1
then the checksum is 187. 187 divided by 11 = 17
remainder 0 so the check digit matches the rest of the numbers which means
they were probably entered correctly.
Presence checks - these simply check that
an entry has been made in a particular field. It it has not then the system will
not allow the record to be saved or any entries to be made in later fields.
Typical validation rules
|
| Rule |
What is being checked |
Valid data |
Invalid data |
| >20 |
If a numerical entry is greater than 20 |
21 |
20 |
| >=20 |
If a numerical entry is greater than OR equal
to 20 |
20 |
19 |
| BETWEEN 20 AND 30 |
If a numerical entry is greater than OR equal
to 20 AND less than OR equal to 30 |
25 |
31 |
| >=20 AND <=30 |
If a numerical entry greater than OR equal to
20 AND less than OR equal to 30 |
25 |
18 |
| "RED" OR "BLUE" |
If a text entry is RED or BLUE |
BLUE |
BLU |
|