152 lines
6.2 KiB
Sed
152 lines
6.2 KiB
Sed
# Common bitwise operations using regular expressions


#


# Copyright (C) 2018 Mike Gerwitz


#


# This program is free software: you can redistribute it and/or modify


# it under the terms of the GNU General Public License as published by


# the Free Software Foundation, either version 3 of the License, or


# (at your option) any later version.


#


# This program is distributed in the hope that it will be useful,


# but WITHOUT ANY WARRANTY; without even the implied warranty of


# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the


# GNU General Public License for more details.


#


# You should have received a copy of the GNU General Public License


# along with this program. If not, see <http://www.gnu.org/licenses/>.


#


# This script implements the most common unary and binary bitwise operations


# on 8bit (1byte) values. The format of the input is:


#


# C BYTE[ BYTE]


#


# Where C is one of the commands defined below and, BYTE is an 8bit value


# represented by 0s and 1s. The square brackets denote an optional


# valuesome operators are unary (require one argument) and others are


# binary (require two arguments).


#


# For example:


#


# ^ 11100001 11110000


#


# will XOR the two bytes to produce `00010001'. Whereas:


#


# < 11111111


#


# will perform a logical left shift to produce `11111110'.


#


# Equality is a special operator: we first XOR the two values and then check


# to see if all bits are unset. If so, then the values are identical, and


# all bits are set; otherwise, all bits are cleared. For example:


#


# = 11100011 11100011


#


# will result in `11111111'. A nonmatch results in `00000000'.


#


# The regexes below all follow common patterns. To make that pattern clear,


# some regexes may do useless things (e.g. `.\{0\}') so that they are


# wellaligned.


#


# Transformations use `:' and `.' as intermediate values to represent 1 and


# 0 respectively. This is necessary to ensure that one regex does not


# operate on the replacement of another (for example, NOT replacing 0 with 1


# and then replacing 1 with 0 immediately thereafter; or the dualuse of OR


# with XOR).


#


# Below, A denotes the first byte and B the second. The term ``set'' refers


# to a bit with a value of 1 and ``clear'' a value of 0.


#


# (This could have been implemented more concisely by branching in a loop,


# but I want to be clear that this is being done with vanilla replacements


# using regexes and that such loops are not needed.)


#


# Note that all regexes operate at a fixed position; this makes them


# suitable as a template for generalpurpose use in larger pattern spaces.


##




# Bitwise AND (&). If the value of the bit in A is already clear, then we


# need not do anything, because the result will always be clear. Otherwise,


# we need only clear the bit if the respective bit of B is clear.


s/^\(& .\{0\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{1\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{2\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{3\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{4\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{5\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{6\}\)1\(.\{8\}0\)/\1.\2/


s/^\(& .\{7\}\)1\(.\{8\}0\)/\1.\2/




# Bitwise OR (), XOR (^), equality (=). This logic is shared for each


# operation (see XOR and equality below). If the bit in A is already set,


# then we need not do anything, because the result will always be set.


# Otherwise, we need only set the bit if the respective bit in B is set.


s/^\([=^] .\{0\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{1\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{2\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{3\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{4\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{5\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{6\}\)0\(.\{8\}1\)/\1:\2/


s/^\([=^] .\{7\}\)0\(.\{8\}1\)/\1:\2/




# Bitwise XOR (^), equality (=). We must perform two steps: first, if a bit


# in A is clear, then it should be set if the respective bit in B is set;


# this logic is handled above in OR. Otherwise, if A is set, then it should


# be cleared if the respective bit in B is also set.


s/^\([=^] .\{0\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{1\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{2\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{3\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{4\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{5\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{6\}\)1\(.\{8\}1\)/\1.\2/


s/^\([=^] .\{7\}\)1\(.\{8\}1\)/\1.\2/




# Bitwise NOT (~). This is a unary operation. A bit in A is set if it is


# clear and viceversa.


s/^\(~ .\{0\}\)1/\1./; s/^\(~ .\{0\}\)0/\1:/;


s/^\(~ .\{1\}\)1/\1./; s/^\(~ .\{1\}\)0/\1:/;


s/^\(~ .\{2\}\)1/\1./; s/^\(~ .\{2\}\)0/\1:/;


s/^\(~ .\{3\}\)1/\1./; s/^\(~ .\{3\}\)0/\1:/;


s/^\(~ .\{4\}\)1/\1./; s/^\(~ .\{4\}\)0/\1:/;


s/^\(~ .\{5\}\)1/\1./; s/^\(~ .\{5\}\)0/\1:/;


s/^\(~ .\{6\}\)1/\1./; s/^\(~ .\{6\}\)0/\1:/;


s/^\(~ .\{7\}\)1/\1./; s/^\(~ .\{7\}\)0/\1:/;




# Logical left shift (<), right shift (>). For left shifts, the first bit


# in A is discarded and a 0 is added to the end. For right shifts, the last


# bit of A is discarded and a 0 is added to the beginning.


s/^< .\(.\{7\}\)/< \10/


s/^> \(.\{7\}\)./> 0\1/




# Arithmetic right shift (a). Similar to a logical right shift, except that


# instead of shifting in a clear bit, the sign is maintained (in a two's


# complement system, the most significant bit is the sign bit). An


# arithmetic left shit is the same as a logical left shift, so we do not


# provide such an operator.


s/^a \(.\)\(.\{6\}\)/a \1\1\2/




# Circular shift (rot8) left (r), right (R). Rather than shifting in a


# clear bit, the bit that is shifted off of the end is readded to the other


# end. This is also called a rotation.


s/^r \(.\)\(.\{7\}\)/r \2\1/


s/^R \(.\{7\}\)\(.\)/R \2\1/




# Replace the intermediate values `:' and `.' with their respective


# bits. We must do this _before_ the equality check.


s/:/1/g; s/\./0/g




# Equality check (=). Since we already replaced the intermediate values


# above, we now have the final result of an XOR. If all bits are _clear_


# (A^B=0), that means A=B. Otherwise, they differ. If A=B, then we set all


# bits and clear the operator. If the first replacement does not occur,


# then the operator will still be set, and so we clear all bits in A.


s/^= 0\{8\}/ 11111111/


s/^= .\{8\}/ 00000000/




# Prepare the final output by discarding the command and second byte.


s/^. \(.\{8\}\).*/\1/




# Exit with code 1 so that the animate script knows we're done.


q1
