Professional Documents
Culture Documents
Page 1 of 5
To find the last-used cell in the range you can array-enter =MAX(ROW(1:10)*(A1:E10<>"")). The part, A1:E10<>"" will return a 10x5 array of TRUE/FALSE values. This comparison could also have been done using ISBLANK function: NOT(ISBLANK(A1:E10)). FALSE means the cell is blank, TRUE means the cell has some value. By multiplying this array by the values {1;2;3;...;9;10}, which is done by using ROW(1:10), you get {0,0,0,0,0;0,0,0,0,0;0,0,0,3,0;4,0,0,0,0;0,0,0,0,0;0,0,6,0,0;0,0,0,0,0;0,0,0,0,8;0,9,0,0,0;0,0,0,0,0}. Notice the semi-colons after every 5 values, corresponding to the next row in the evaluation. The zeros are the result of multiplying FALSE by any number. If you visually scan the expansion of the array, you see the first non-zero is a 3. This is in the third row of values, in the set {0,0,0,3,0}. This corresponds to cell D3 in the range A1:E10. The 3 is the result of multiplying TRUE by the corresponding position of the ROW(1:10) -- were in the third row, and TRUE multiplied by 3 is 3. Every non-zero corresponds to a non-blanksin the range. This array is passed to the MAX function, which in this case returns a 9, giving the desired result: 9 means the bottommost row used in A1:E10 was 9.
between the first and last names. (Like BillClinton). This technique explores a way to determine which names, if any, are in error. Let B2 contain the formula =IF(NOT(ISERROR(B1)),INDEX(A1:A10,B1)&" is incorrect.","") which says that if cell B1 is not an error, then display the offending cell and the text is incorrect.. The offending cell is indicated as the INDEX(A1:A10,B1). Therefore, cell B1 must contain the location of the cell which contains an error within the range A1:A10. Cell B1 contains the formula =MATCH(TRUE,ISERROR(SEARCH(" ",A1:A10)),0). This is an interesting formula in that one does not usually use a MATCH(TRUE... in Excel. You will find that this (as well as MATCH(FALSE...) has very powerful applications. But first, lets examine the inner part of the formula: SEARCH(" ",A1:A10) is unusual in that the SEARCH function usually examines a cell, not a range. Therefore, its this usage which requires the formula being array-entered. This function might return something like {4;3;5;4;8;8;#VALUE!;6;#VALUE!;6}. The first value, 4, means that theres a space in the 4th position of cell A1. Similarly for the 3 and 5. The #VALUE! error in the 7th position of the array corresponds to cell A7, which contains BillClinton, and indicates there is no space in the cell. Similarly for cell A9. This array is passed to the ISERROR function which converts the array to {FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;TRUE;FALSE} where the TRUE corresponds to the #VALUE! errors. The MATCH function, then, finds the TRUE value in this array, and the value in cell B1, 7 in this case, is the result (the ,0 at the end of the MATCH function indicating that an exact match is to be found). If all the cells contained a space, the array would be all FALSEs, and the MATCH would return an error. Therefore, cell B2 would be blank because its testing for an error condition in cell B1.
http://www.emailoffice.com/excel/arrays-bobumlas.html
26.10.2013
Array Formulas
Page 2 of 5
If you were interested in how many times an a occurred, counting banana as having 3 as, you could use the formula =SUM(LEN(MyRange))-SUM(LEN(SUBSTITUTE how this one works. The LEN function usually requires, again, a string, not a range. So, the requirement is to array-enter the formula so Excel knows to treat each value in the range as a string for the LEN function. LEN(A1:A2) would return {3;5} if A1 contained bob and A2 contained Sally. By taking the SUM(LEN(MyRange)), we have the total number of characters in MyRange. If we subtract from that the total characters in MyRange without the letter a, then we have the number of times a appears. This is what the second part of the formula does. The SUBSTITUTE function is changing a to the null string. Within a range of cells, each is returned as an array of values, passed to the LEN function, and SUMmed up.
Cell A1 contains the text of the path to a file. The backslash is used as a delimiter, separating the various fields. Cell A1 could also contain any text, like Now is the time for all good men to come to the aid of their party where the delimiter is a space, and this technique will extract the word party, the last field in the string. The formula in cell A3 is complicated, so lets see what it needs to contain rather than just reveal it. The task would be easy if we needed to find the first backslash: we could use the SEARCH function. If the text were reversed, then we could find the first backslash. So, our job will be to reverse the text. It turns out that reversing the text in one formula is not possible (or at least I have not discovered a way!), but it is possible to reverse the text as single characters rather than as one string. For an easier example, suppose cell D7 in a different spreadsheet contained the text abc. How can we change that to the array {c,b,a}? Well, that can be done fairly easily. Since the technique which makes it {a,b,c} is by array-entering =MID(D7,ROW(1:3),1), look at what this function does: =MID(D7,4-ROW(1:3),1). By introducing the 4-, we are subtracting the array {1;2;3} from the value 4, which produces the array {3;2;1}! Now, =MID (D7,{3;2;1},1) produces {c,b,a}. Now that you see the approach, lets look a bit further. Instead of using a constant, 4, we use the LEN(D7)+1. Instead of just reversing the string, we also need to MATCH the delimiter against this string to find the first one.The formula MATCH("\",MID(A1,LEN(A1)+1-ROW(1:25),1),0) used in the illustration above becomes MATCH("\", {"f";"f";"u";"t";"S";"y";"M";"\";"S";"E";"L";"P";"M";"A";"X";"E";"\";"5";"L";"E";"C";"X";"E";"\";":";"C"},0). You can see that were searching backwards for the occurrence of the backslash. This evaluates to 8. So, what do we do with this 8? We actually want one less characters from the end because we dont want to include the backslash. We need to take the last 7 characters of the cell. That lends itself to the RIGHT function. Therefore, subtracting 1 from the above formula does the trick. Array enter: =RIGHT(A1,MATCH("\",MID(A1,LEN(A1)+1-ROW(1:25),1),0)-1). It turns out that the adding 1 then subtracting 1 do, in fact, cancel each other out, and the more compact formula is =RIGHT(A1,MATCH("\",MID(A1,LEN(A1)-ROW(1:25),1),0)). To ensure you will find the last part when the last part may be more than 25 characters, you should change the ROW(1:25) to be ROW(1:255). After explaining all that, there is a way to do it without an array formula! By using the SUSTITUTE function, you can change the last delimiter to some other character, then search the string for this character, and everything after that character is the last part. The SUBSTITUTE function has as its syntax, =SUBSTITUTE(text,old text, new text,instance #). The instance# you want is calculated also by the SUBSTITUTE function: =LEN(A1)-LEN(SUBSTITUTE(A1,"\","")) returns the number of times "\" occurs (the SUBSTITUTE is changing \ to null, so the length changes. If the string were a\a\a, then this gives 5-LEN(aaa) or 5-3, or 2, the number of backslashes. Now change the last backslash to some other string, say CHAR(13): SUBSTITUTE(A1,"\",CHAR(13),LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))). This changes the last backslash to CHAR(13). Now find it: FIND(CHAR(13), SUBSTITUTE(A1,"\",CHAR(13),LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))) which gives the position of this character, and now use it in a MID function to return the last part: =MID(A1,FIND(CHAR(13), SUBSTITUTE(A1,"\",CHAR(13),LEN(A1)-LEN(SUBSTITUTE
(A1,"\",""))))+1,255)
Granted, this formula is longer, but it calculates much more quickly than an array formula.
But how can we find that using Excel formulas? If columns A and B were combined, for example, by entering =A2&B2 in cell D2 and filling down, wed have a unique set of values to search. That is, in column D wed see Bob1;Bob2;Bob3, etc. John3 would appear in cell D9. Then we could use =MATCH(E2&E3,D2:D20,0) and find the row the result is in. Well, using array formulas, we can simulate the combining of columns A and B. Cell E4 contains the array-entered formula =INDEX(C:C,MATCH (E2&F2,$A$1:$A$22&$B$1:$B$22,0)). Since were ANDing A1:A22 with B1:B22, this gives us the array {"NameCode";"Bob1";"Bob2";"Bob3";"Bob4";"Bob5";"John1";"John2";"John3";"John4";"John5";"Bill1";"Bill2";"Bill3";"Bill4";"Bill5";"Bill6";"Bill7";"Bill8";"Bill9";"Bill10";"Bill11"}.
http://www.emailoffice.com/excel/arrays-bobumlas.html
26.10.2013
Array Formulas
Page 3 of 5
MATCHing John3 to this gives the result we want, 9. Passing this to the INDEX function gives the number we need, 158.
In the top half, we see two ranges, A1:A3 and C1:C3, and theres a 6 in both ranges. As you can see form cell B6, the TRUE indicates that there is a duplicate. The text box overlaying cells D4:H5 shows the array-entered formula, =OR(TRANSPOSE(A1:A3)=C1:C3). The TRANSPOSE function changes the orientation of A1:A3 to be horizontal, just so its the opposite orientation of the second range. By doing this, Excel will compare each value in the first array to every value in the second, not simply a one-to-one correspondence. That is, the 4 in cell C1 will be compared to {1;6;3}, the 5 in C2 will be compared to {1;6;3}, and the 6 in C3 will be compared to {1;6;3}, giving 9 comparisons in total. In general, the number of comparisons is given by the number of elements of each array multiplied together, in this case 3x3. If you dragged across the formula inside the OR function and pressed the F9 key (calculate), you would see {FALSE,FALSE,FALSE;FALSE,FALSE,FALSE;FALSE,TRUE,FALSE}, where the eighth value is TRUE because that is the comparison of cell C3 against A2. Since this array of TRUEs and FALSEs are passed to the OR function, any occurrence of a TRUE will result in a TRUE, indicating theres a match, or duplicate. If none of the values match, they will be all FALSE values, which result in a FALSE when ORed together. In the bottom half of the illustration, the comparison is already of differing orientations, so the TRANSPOSE function is not needed, and the range A9:A11 (same values as before) is being compared to C10:H10. There are 3x6, or 18, comparisons. If you F9 the formula inside the OR function, you would see, of course, {FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE}. Notice every sixth FALSE is followed by a semicolon, whereas all the others are commas. This indicates the orientation.
Cells A1:A7 are defined as range, and the COUNTIF function is used to return an array of values. Usually, the COUNTIF function takes two arguments: a range, and a criteria, where the criteria is a simple test. In this case, were using the range itself as the criteria, and therefore the function must be array-entered. What does COUNTIF(range,range) do for us? Well, each value in the range is used as the criteria in turn. When the range A1:A7, which evaluates to {1;3;2;4;3;3;4}, is used twice in the COUNTIF function, the first test is =COUNTIF({1;3;2;4;3;3;4},1) which counts the number of times the range is equal to 1. This is 1. The second test is =COUNTIF({1;3;2;4;3;3;4},3) which counts the number of times the range equals 3. The reason the criteria is 3 is that this is the second value in the range. There are three 3s. Similarly for the other 5 tests: there is one 2, two 4s, three 3s, three 3s, and two 4s. Therefore, =COUNTIF({1;3;2;4;3;3;4},{1;3;2;4;3;3;4}) produces the array {1;3;1;2;3;3;2}. Passing this to the MAX function, =MAX(COUNTIF(range,range)), returns the largest value of the array, or 3. Since this doesnt equal 1, the result is FALSE, which indicates the range is not made up of unique values. The range C8:C14 in the illustration contains the characters a-g. The COUNTIF(Range2,Range2) returns the array
http://www.emailoffice.com/excel/arrays-bobumlas.html
26.10.2013
Array Formulas
Page 4 of 5
{1;1;1;1;1;1;1} because each of the items in the criteria returns a 1: =COUNTIF(Range2,a) counts the number of times the Range2 equals a, which is 1, and similarly for b, c, etc. MAX({1;1;1;1;1;1;1}) is 1, which, when compared to a 1, returns TRUE, indicating the range contains unique values. Theres another very powerful formula which counts the number of unique values in a range: =SUM(1/COUNTIF(range,range)), array-entered. Suppose the range consisted of the 7 values {11,12,13,11,14,11,12}. This function would return 4 because there are 4 unique values. The COUNTIF part returns the array {3,2,1,3,1,3,2} where each value is the number of times the corresponding values occur. That is, 11 occurs 3 times, 12 twice, 13 once, etc. Dividing this array into 1 gives the array {.33333,.5,1,.33333,1,.33333,.5}. Each of the .33333s (or 1/3) corresponds to the 3s, and there are 3 of them. The .5s correspond to the 2s, and there are 2 of them, etc. Adding them up brings together the 3-thirds, 2-halves, etc, and therefore counts how many there are! That is, we have 3 x .33333, 2 x .5, and 2 1s. This totals 4, the number of unique values. So you could use this formula as well as the MAX function just discussed.
When you need to find a close value, youre not interested in whether the value is on the high side or low side. That is, for the average of 21.8, 22 is closer than 21 -subtracting the average from these values produces a .2 and a negative .8. Since youre not interested in positive or negative, you cant just take the minimum value, but you can take the minimum value of the absolute differences.
(Units)).
To get the absolute differences, you take each value and subtract the average, then pass this calculation to the ABS function. So far, then you have =ABS(Units-AVERAGE This part produces the array {11.8;6.8;1.2;23.2;5.8}: (10 - 21.8 is -11.8, 15 - 21.8 is -6.8, etc. The negative sign disappears because of the ABS function).
Now we have a range of differences, but were interested in the smallest of these, so this is passed to the MIN function. This returns 1.2. Now we have to find where the 1.2 is in the range, so we MATCH this MIN(ABS(Units-AVERAGE(Units))) against the range ABS(Units-AVERAGE(Units)), and the third parameter of the MATCH function, zero, requests an exact match, so we find the 1.2 in the 3rd item of the array. Pass this 3 to the INDEX function, and Excel returns the closest value. The final formula is =INDEX(units,MATCH(MIN(ABS(units-AVERAGE(units))),ABS(units-AVERAGE(units)),0)) array-entered, of course.
http://www.emailoffice.com/excel/arrays-bobumlas.html
26.10.2013
Array Formulas
Page 5 of 5
{TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;FALSE;TRUE},and when this is multiplied by Amount, it evaluates to {1;2;0;4;5;6;0;8}. When this is SUMmed together you get 26. If you had wanted to add the amounts together when both conditions are met, you could use the array-entered formula: =SUM((Name="Bob")*(MONTH(Date)<5)
*Amount).
Three cells are used to drive the calendar in the session: A1, which contains =TODAY(), B1, which contains =WEEKDAY(B2), and B2 which contains =DATE(YEAR Cell B2, then, is the first of the month of the date entered in cell A1. If, by default, A1 contains =TODAY(), then the calendar will open up displaying the current month. If you wanted any other month, you can simply enter a new date in cell A1. Cell B1 is the day of the week of cell B2, as a number, where 1=Sunday. A 6 row x 7 column selection is made and the formula =Calendar is array-entered. Calendar is =week*7+weekday+B2-B1. Lets examine how this all works. Lets assume were working with February, 98. The part of the formula =week*7+weekday will always be the same, namely the numbers 1-42. How? Week*7 is the column-oriented array {0;7;14;21;28;35}. When this is added to the row-oriented array {1,2,3,4,5,6,7}, Excel expands to give {1,2,3,4,5,6,7;8,9,10,11,12,13,14;15,16,17,18,19,20,21;22,23,24,25,26,27,28;29,30,31,32,33,34,35;36,37,38,39,40,41,42}. Notice the semi-colons after the 7, 14, etc. This indicates the next row is begins there. To this array is added B2, which contains the date for the first day of the month. For February 98, this is the serial # 35827. Now we have an array consisting of {35828,35829, , etc.}. Now the weekday of 2/1/98 is subtracted. Since 2/1 is a Sunday, the weekday is 1, and the array becomes {35827,35828, , etc.). Since these are valid serial #s, you can simply format them as d, and youll see the day of the month. Since 2/1/98 is a Sunday, the calendar starts right in with the 1 in the upper left cell of the 6x7 range. By making the row heights high and formatting the cells alignment to be vertical:top, horizontal:right, it looks like a real calendar!
A few notes, however. If we looked at November 97, for example, we would see the upper left cell start with a 26. This is because the entire 6x7 array is filled with serial numbers and the Sunday in the upper left corner is the 26th of October 97. The whole first row contains {26,27,28,29,30,31,1} and we only want to see the 1. Similarly, the bottom row contains {30,1,2,3,4,5,6} because it extends into December 97 and here we only want to see the 30. This was fixed by making the formula a bit more complicated, but makes the result just what we want:
=IF(DAY(Calendar)>15,IF(ROW()=5,"",Calendar),IF(DAY(Calendar)<15,IF(ROW()>8,"",Calendar),Calendar))
This formula is array-entered, of course, and says that if the day of the month is >15, then in the first row show blank. Lets examine this a bit more before looking at the rest of the formula. In the first row, we saw some days from the previous month (when we looked at 11/97). This was row 5 in the session. So, if the day is over 15 (15 is fairly arbitrary the first number we could see from the previous month is 26, so 25 would have worked as well), and the row is the first row, dont show it: =IF(DAY
(Calendar)>15,IF(ROW()=5, "",Calendar
Similarly, in the last row(s), we dont want to show the days from the beginning of the next month, so here were looking at IF(DAY(Calendar)<15,IF(ROW()>8, "",Calendar which effectively removes the beginning of the next month. We chose ROW()>8 rather than ROW()=10 because some months are short , like 2/98, and only take up 4 rows. In that case we want to blank out rows 5 and 6.
http://www.emailoffice.com/excel/arrays-bobumlas.html
26.10.2013