# Common bitwise operations using regular expressions # # Copyright (C) 2018 Mike Gerwitz # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # # This script implements the most common unary and binary bitwise operations # on 8-bit (1-byte) values. The format of the input is: # # C BYTE[ BYTE] # # Where C is one of the commands defined below and, BYTE is an 8-bit value # represented by 0s and 1s. The square brackets denote an optional # value---some operators are unary (require one argument) and others are # binary (require two arguments). # # For example: # # ^ 11100001 11110000 # # will XOR the two bytes to produce `00010001'. Whereas: # # < 11111111 # # will perform a logical left shift to produce `11111110'. # # Equality is a special operator: we first XOR the two values and then check # to see if all bits are unset. If so, then the values are identical, and # all bits are set; otherwise, all bits are cleared. For example: # # = 11100011 11100011 # # will result in `11111111'. A non-match results in `00000000'. # # The regexes below all follow common patterns. To make that pattern clear, # some regexes may do useless things (e.g. `.\{0\}') so that they are # well-aligned. # # Transformations use `:' and `.' as intermediate values to represent 1 and # 0 respectively. This is necessary to ensure that one regex does not # operate on the replacement of another (for example, NOT replacing 0 with 1 # and then replacing 1 with 0 immediately thereafter; or the dual-use of OR # with XOR). # # Below, A denotes the first byte and B the second. The term ``set'' refers # to a bit with a value of 1 and ``clear'' a value of 0. # # (This could have been implemented more concisely by branching in a loop, # but I want to be clear that this is being done with vanilla replacements # using regexes and that such loops are not needed.) # # Note that all regexes operate at a fixed position; this makes them # suitable as a template for general-purpose use in larger pattern spaces. ## # Bitwise AND (&). If the value of the bit in A is already clear, then we # need not do anything, because the result will always be clear. Otherwise, # we need only clear the bit if the respective bit of B is clear. s/^\(& .\{0\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{1\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{2\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{3\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{4\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{5\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{6\}\)1\(.\{8\}0\)/\1.\2/ s/^\(& .\{7\}\)1\(.\{8\}0\)/\1.\2/ # Bitwise OR (|), XOR (^), equality (=). This logic is shared for each # operation (see XOR and equality below). If the bit in A is already set, # then we need not do anything, because the result will always be set. # Otherwise, we need only set the bit if the respective bit in B is set. s/^\([=|^] .\{0\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{1\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{2\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{3\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{4\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{5\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{6\}\)0\(.\{8\}1\)/\1:\2/ s/^\([=|^] .\{7\}\)0\(.\{8\}1\)/\1:\2/ # Bitwise XOR (^), equality (=). We must perform two steps: first, if a bit # in A is clear, then it should be set if the respective bit in B is set; # this logic is handled above in OR. Otherwise, if A is set, then it should # be cleared if the respective bit in B is also set. s/^\([=^] .\{0\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{1\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{2\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{3\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{4\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{5\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{6\}\)1\(.\{8\}1\)/\1.\2/ s/^\([=^] .\{7\}\)1\(.\{8\}1\)/\1.\2/ # Bitwise NOT (~). This is a unary operation. A bit in A is set if it is # clear and vice-versa. s/^\(~ .\{0\}\)1/\1./; s/^\(~ .\{0\}\)0/\1:/; s/^\(~ .\{1\}\)1/\1./; s/^\(~ .\{1\}\)0/\1:/; s/^\(~ .\{2\}\)1/\1./; s/^\(~ .\{2\}\)0/\1:/; s/^\(~ .\{3\}\)1/\1./; s/^\(~ .\{3\}\)0/\1:/; s/^\(~ .\{4\}\)1/\1./; s/^\(~ .\{4\}\)0/\1:/; s/^\(~ .\{5\}\)1/\1./; s/^\(~ .\{5\}\)0/\1:/; s/^\(~ .\{6\}\)1/\1./; s/^\(~ .\{6\}\)0/\1:/; s/^\(~ .\{7\}\)1/\1./; s/^\(~ .\{7\}\)0/\1:/; # Logical left shift (<), right shift (>). For left shifts, the first bit # in A is discarded and a 0 is added to the end. For right shifts, the last # bit of A is discarded and a 0 is added to the beginning. s/^< .\(.\{7\}\)/< \10/ s/^> \(.\{7\}\)./> 0\1/ # Arithmetic right shift (a). Similar to a logical right shift, except that # instead of shifting in a clear bit, the sign is maintained (in a two's # complement system, the most significant bit is the sign bit). An # arithmetic left shit is the same as a logical left shift, so we do not # provide such an operator. s/^a \(.\)\(.\{6\}\)/a \1\1\2/ # Circular shift (rot8) left (r), right (R). Rather than shifting in a # clear bit, the bit that is shifted off of the end is re-added to the other # end. This is also called a rotation. s/^r \(.\)\(.\{7\}\)/r \2\1/ s/^R \(.\{7\}\)\(.\)/R \2\1/ # Replace the intermediate values `:' and `.' with their respective # bits. We must do this _before_ the equality check. s/:/1/g; s/\./0/g # Equality check (=). Since we already replaced the intermediate values # above, we now have the final result of an XOR. If all bits are _clear_ # (A^B=0), that means A=B. Otherwise, they differ. If A=B, then we set all # bits and clear the operator. If the first replacement does not occur, # then the operator will still be set, and so we clear all bits in A. s/^= 0\{8\}/ 11111111/ s/^= .\{8\}/ 00000000/ # Prepare the final output by discarding the command and second byte. s/^. \(.\{8\}\).*/\1/ # Exit with code 1 so that the animate script knows we're done. q1