Tuesday, December 18, 2007

Learning Assembler with Delphi 5

Have you ever tried to write a bitmap painting program with that fancy brush, for which you've worked out how to blend and fade colors to spectacular effect, only to be let down by the speed at which Delphi's canvas pixel array operates? You have? Well, read on.

First, before we go any further, please take time to fire up Delphi and make sure that 'aligned record fields' is switched on in the compiler section of project options. It is? Good. Virtually every API function we shall use in this section demands that all data is double word aligned. Those two minutes have just saved you hours of debugging time.

What makes Windows so good a programming platform is that it acts as an interface between the programmer and the hardware. You don't need to worry about what kind of graphics or sound card is actually there; you just ask Windows to play a wav file or display a bitmap and it does it. No more sound card crashes because you chose the wrong interrupt.

There is, of course, a price to pay. That price is, as ever, speed. With DirectX there are now API routines for direct screen access, however their performance is highly dependent upon the hardware fitted. Delphi's canvas is designed to hide the complexities of using the GDI to draw a square on the screen, leaving well alone the rather more esoteric nature of DirectX. This simplicity is once again at the price of speed. I decided some time ago that I needed a method of constructing a bitmap in memory which I could then BitBlt on to a canvas. I would be able to build the bitmap using assembler routines for speed, and BitBlt'ing is quick. Indeed, what I primarily needed at the time was to be able to plot full 24bit color values on to a bitmap at a rate of about one million pixels per second, this being for a graphical display routine. I soon discovered that attempting to unravel the tortuous way in which Delphi deals with bitmaps would prove unfruitful. As such, the Windows GDI functions would have to provide the answer.

Windows 95 provides Device Independent Bitmaps as its way of allowing the direct memory addressing of bitmap pixels, together with several routines to service their use. The key function in this respect is CreateDIBSection, which not only creates the appropriate structure and handle, but also provides the address of the DIB's pixel array. Given the address and size of the bitmap in memory, writing routines to manipulate the image becomes almost trivial. Functions such as StretchDIBits can then be used to map the resulting bitmap on to any canvas.

At this point you may be feeling misled; after all, this project is entitled 'fast screen plotting', and the method described does not seem direct. Look at the DIB that we are going to create as a working area in which we shall construct an image before displaying it. Such an area of memory is often referred to as a frame buffer, and is commonly used where generating a new image from scratch is quicker than manipulating an existing image. Such is the case for real time display systems, 3D views for example. However, this does not preclude us from using this approach to manipulate existing pictures. Encapsulating this frame buffer in a class definition seems appropriate, giving us the following basic declaration.

TFrame = class
private
Bitmap : HBitmap;
lpvbits : Pointer;
Bitmapinfo : PBitmapinfo;
{Bitmap will hold the windows handle of
the DIB, lpvbits will be a pointer to the
DIB's pixel array, and Bitmapinfo is a
pointer to an instance of the Windows
Bitmapinfo structure in which we shall
describe the format of the DIB, ie.
width, height and color depth}
FWidth : integer;
FHeight : integer;
FSize : integer;
FLineLen : integer;
{just fields to store
values of the DIB's width, height, size
in bytes, and horizontal size in bytes}
public
constructor Create(ACanvas:TCanvas;Width,Height:integer);
destructor Destroy;override;
{the size of the DIB to be created
will be determined by the width and height
parameters, the canvas parameter determines
the palette to be used but this is best
explained in 'Create's' implementation. The
destructor must free up any memory allocated
to the DIB. The memory usage is
likely to be considerable.}
function Draw(ACanvas:TCanvas;X,Y:integer):integer;
procedure Plot(X,Y,Color:integer);
{Draw provides a method of painting
the DIB on to a Canvas. X and Y giving
the canvas coordinates of the top left
corner of the DIB after painting. Plot allows
one to set the color of a given
pixel in the DIB.}
end;



Before looking at the implementations we should consider how we are to define the DIB. I am not referring here to the windows api functions but rather the color depth of the DIB as this will dramatically affect our code. The windows routines are generic and can cope with any color depth from monochrome to true color. Consequently there is a speed penalty whilst format checking is carried out. By choosing a single color format, our routines can be optimized. In this case I have chosen 24bit true color as this met my needs at the time of writing and avoided the need for palette definitions. However for ultimate speed, 8bit color using the system palette is the way to go.

Let us now look at the implementation of each object method in turn.

constructor TFrame.Create(ACanvas:TCanvas;Width,Height:integer);
var LineAdj:integer;
begin
FWidth := Width;
FHeight := Height;
FLineLen := 3*Width;
LineAdj := FLineLen and 3;
LineAdj := 4 - LineAdj;
LineAdj := LineAdj and 3;
FLineLen := FLineLen + LineAdj;
FSize := FLineLen * FHeight;
{Storing the values of width and height
in the appropriate fields is straightforward
enough, the tricky bit is calculating the
size of the DIB. Each horizontal scan line
of the DIB must be double word aligned, that
is to say, each scan line must start at an
address which is a multiple of four bytes.
Windows demands this is true and will fail to
create the DIB if it is not. Why this demand is
made is a matter of cpu architecture and
optimizing performance, and the same concern
for alignment is why I asked you to check that
'aligned record fields' is switched on in the
compiler. To calculate the
memory required to store one horizontal scan
line we multiply the width by three and then
work out how many bytes we must tag on the end
to make this value divisible by four. Summing
these values gives us FLineLen the number of
bytes required to store a single horizontal
line. The total memory used by the DIB being
the product of FLineLen and the number of
Horizontal lines FHeight.}
New(Bitmapinfo);
with Bitmapinfo^.bmiHeader do
begin
bisize := 40; {size of the bmiHeader structure}
biWidth := Width;
biHeight := Height;
biPlanes := 1; {must always be one}
biBitCount := 24; {24bits required to store each pixel}
biCompression := BI_RGB; {image uncompressed, no palette}
biSizeImage := FSize; {size of image pixel array}
biXPelsPerMeter := 0; {info for scaling when printing etc.}
biYPelsPerMeter := 0;
biClrUsed := 0; {number of colors in palette}
biClrImportant := 0; {number of important colors in palette}
end;
{The PBitmapinfo type is defined in
Delphi's Windows unit and encapsulates the
Windows Bitmapinfo structure itself containing
two record structures, bmiHeader and bmiColors.
The latter defines a palette, but as we are
using explicit 24bit true color values, a palette
is not required. Consequently bmiColors remains
null. The bmiHeader structure defines the size
and color usage as above.}
Bitmap := CreateDIBSection(ACanvas.Handle,Bitmapinfo^,
DIB_RGB_COLORS,lpvbits,nil,0);
{If we look at the parameters in order,
ACanvas.Handle is the handle of a valid
device context and is used to define the
logical palette of the DIB if the color
usage is defined as DIB_PAL_COLORS, it
isn't so the handle passed doesn't matter
except it must be a valid device context.
Bitmapinfo^ passes the size, format and
color data in the required structure.
DIB_RGB_COLORS defines the color usage, in this
case explicit RGB values. lpvbits is a
pointer whose value will be changed so that
it points to the pixel array of the DIB.
The last two parameters tell windows how the
memory required by the DIB is to be allocated,
in this case the values tell windows to allocate
the memory itself. It is possible to handle the
memory allocation yourself, but why bother.
The function returns a valid handle
in Bitmap if successful.}
end;
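
The scan line arithmetic in the constructor can be checked on its own, away from any Windows calls. The following is a minimal sketch, assuming a plain console program; ScanLineBytes and ScanLineBytesStepwise are hypothetical helper names and not part of TFrame. The first rounds 3*Width up to a multiple of four in a single expression, the second repeats the four-step LineAdj calculation from Create so the two can be compared.

```pascal
program StrideDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Bytes needed for one scan line of a 24bit DIB, rounded up to a
  multiple of four. ScanLineBytes is a hypothetical helper name. }
function ScanLineBytes(Width: Integer): Integer;
begin
  Result := (3 * Width + 3) and not 3;
end;

{ The same value worked out step by step, as in TFrame.Create. }
function ScanLineBytesStepwise(Width: Integer): Integer;
var
  LineAdj: Integer;
begin
  Result := 3 * Width;
  LineAdj := Result and 3;   { bytes past the last multiple of four }
  LineAdj := 4 - LineAdj;    { bytes needed to reach the next one }
  LineAdj := LineAdj and 3;  { turn a full 4 back into 0 }
  Result := Result + LineAdj;
end;

var
  W: Integer;
begin
  { Print both results side by side for a few small widths. }
  for W := 1 to 8 do
    WriteLn(W, ' pixels -> ', ScanLineBytes(W), ' bytes (stepwise: ',
      ScanLineBytesStepwise(W), ')');
end.
```

A width of 400 pixels needs no padding, since 1200 is already divisible by four, whereas a width of 5 pixels occupies 15 bytes and is padded to 16.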



The destructor is clearly defined as follows. We just get windows to reclaim the memory allocated to the DIB and we finish by disposing of the Bitmapinfo record.

destructor TFrame.Destroy;
begin
DeleteObject(Bitmap);
Dispose(Bitmapinfo);
inherited Destroy;
{always finish by calling the inherited destructor}
end;



The Draw method just uses StretchDIBits. By changing the parameters various scaling and image manipulation effects can be achieved.

function TFrame.Draw(ACanvas:TCanvas;X,Y:integer):integer;
begin
StretchDIBits(ACanvas.Handle,X,Y,FWidth,FHeight,0,0,FWidth,FHeight,
lpvbits,Bitmapinfo^,DIB_RGB_COLORS,SRCCOPY);
Result := GetLastError;
end;



Now we can get back to assembler and define a routine to change the color of a pixel in the DIB.

procedure TFrame.Plot(X,Y,Color:integer);assembler;
asm
push ebx
mov ebx,[eax].Bitmap
cmp ebx,0
je @plotdone
{if the value of Bitmap is zero
then no memory has been allocated to the DIB.
All we can do is abort the plot.}
pop ebx
push ebx
{recover value of ebx without affecting the stack}
cmp edx,0
jl @plotdone
{if X coordinate is less than zero then abort}
cmp edx,[eax].FWidth
jge @plotdone
{if X coordinate is greater than
or equal to the DIB's width then abort}
cmp ecx,0
jl @plotdone
cmp ecx,[eax].FHeight
jge @plotdone
{same checks on Y coordinate}
{we need to calculate the memory
offset of point X,Y in the DIB and then
add the memory address of the start of the
DIB to find the actual address of the point.
The offset is FLineLen*Y+3*X}
push eax
push edx
{eax = object base address, edx = X.
since we are about to use the mul
operation we must save these values}
mov eax,[eax].FLineLen
{eax = FLineLen, ecx = Y,
so we can now multiply}
mul ecx
{eax = FLineLen*Y, edx = 0}
mov edx,eax
{we need to recover the values
of X and the object base address from the stack,
so we move the value of FLineLen*Y to edx before
recovering eax's value}
pop ecx
pop eax
{eax = object base address,
edx = FLineLen*Y, ecx = X}
add edx,ecx
{edx = FLineLen*Y+X}
shl ecx,1
{ecx = 2*X}
add edx,ecx
{edx = FLineLen*Y+X+2*X = FLineLen*Y+3*X,
which is what we want}
add edx,[eax].lpvbits
{add the memory address of the
start of the DIB, and edx now holds the
actual address of the pixel X,Y}
mov ecx,[edx]
and ecx,0ff000000h
{get the current value of the pixel.
As we can only move four bytes around at a
time and the pixel color value is only
three bytes long, the fourth and most significant
byte is part of the color value of the
next pixel. Using the 'and' operation we
isolate the value of this fourth byte and
store it in ecx. Note that for the very last
pixel in the array this fourth byte may lie
just past the end of the DIB's memory}
mov ebx,Color
and ebx,0ffffffh
{the value of ebx is currently on
the stack, so this can be recovered in
a moment. Having loaded ebx with the
color value to be 'plotted' we must
ensure it is only three bytes long}
xor ecx,ebx
mov [edx],ecx
{using 'xor' we combine the three
byte color value in ebx with the fourth byte
in ecx, and in doing so avoid affecting the
color of the next pixel. This combined value
is then written over the pixel address
achieving the 'plot'}
@plotdone:
pop ebx
{before exiting we recover ebx's value}
end;
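
For checking the assembler against something readable, the two calculations at the heart of Plot can be written in plain Pascal. This is only a sketch; PixelOffset and MergePixel are hypothetical names introduced here for illustration. The assembler combines the two masked values with 'xor', whereas the sketch uses 'or'; because the masks do not overlap, the result is the same.

```pascal
program PlotDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Byte offset of pixel (X,Y) from the start of the pixel array.
  PixelOffset is a hypothetical helper name. }
function PixelOffset(LineLen, X, Y: Integer): Integer;
begin
  Result := LineLen * Y + 3 * X;
end;

{ Combine a three byte colour with the four bytes already in memory,
  keeping the top byte, which belongs to the next pixel. }
function MergePixel(Existing, Color: Cardinal): Cardinal;
begin
  Result := (Existing and $FF000000) or (Color and $00FFFFFF);
end;

begin
  { Offset of pixel (10,2) in a DIB whose scan lines are 1200 bytes. }
  WriteLn(PixelOffset(1200, 10, 2));
  { The top byte of the existing value survives the merge. }
  WriteLn(MergePixel($AABBCCDD, $00112233) = $AA112233);
end.
```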



So, how do we use this frame buffer? The following example routine illustrates the TFrame creation, the plotting of a basic fractal effect, the display of the resulting image on a form's canvas, and finally the destruction of the object.

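procedure TForm1.Example;
{TForm1 stands in for whichever form class owns
the canvas; Self below refers to that form}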
var x,y:integer;Frame:TFrame;
begin
Frame := TFrame.Create(Self.Canvas,400,300);
{instantiate a TFrame object with
a DIB 400 pixels wide 300 pixels high}
for x := 0 to 399 do
for y := 0 to 299 do
Frame.Plot(x,y,x*y);
{do some plotting, incidentally the
coordinate (0,0) refers to the bottom left
of the image, not the top left
as in a normal bitmap}
Frame.Draw(Self.Canvas,10,10);
{now display the image, in this
case the top left of the image will
have coordinates 10,10 with respect
to the Canvas chosen}
Frame.Free;
{finally dispose of the memory allocated
to the object. In a real application
you are likely to create a TFrame object
at the start of processing, and only dispose
of it just before exiting to windows}
end;



The object as shown here is pretty much as I used it. The only additional function I have found useful has been GetPixel(X,Y), returning the color value of the DIB at (X,Y). This function is, of course, just the Plot routine with a couple of changes. That I have not needed to draw geometric shapes is probably a peculiarity of my own requirements. I suspect, however, that your own circumstances may differ, so there now follow a few more routines for the TFrame object. Incidentally, as the DIB has a valid Windows handle, the GDI functions can be used to draw basic shapes.

function TFrame.GetPixel(X,Y:integer):TColor;assembler;
asm
push ebx
mov ebx,[eax].Bitmap
cmp ebx,0
je @getpixeldone
pop ebx
push ebx
cmp edx,0
jl @getpixeldone
cmp edx,[eax].FWidth
jge @getpixeldone
cmp ecx,0
jl @getpixeldone
cmp ecx,[eax].FHeight
jge @getpixeldone
{we need to calculate the memory
offset of point X,Y in the DIB and then add
the memory address of the start of the DIB
to find the actual address of the point. The
offset is FLineLen*Y+3*X this is the same
as the Plot routine}
push eax
push edx
mov eax,[eax].FLineLen
mul ecx
mov edx,eax
pop ecx
pop eax
add edx,ecx
shl ecx,1
add edx,ecx
add edx,[eax].lpvbits
mov eax,[edx]
and eax,0ffffffh
{having got four bytes of data from
the DIB, we dispose of the fourth,
most significant byte, leaving just
the color value of point X,Y}
@getpixeldone:
pop ebx
end;
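
One point worth keeping in mind when passing colors to Plot or reading them back with GetPixel: a 24bit DIB stores each pixel as blue, green, red in memory, so the four byte value these routines move has the layout $00RRGGBB, whereas an ordinary RGB TColor is laid out as $00BBGGRR. The sketch below shows one way of packing and converting values. FrameColor and SwapRedBlue are hypothetical helper names, not part of TFrame.

```pascal
program ColourOrderDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Pack red, green and blue into the $00RRGGBB layout that Plot
  writes. FrameColor is a hypothetical helper name. }
function FrameColor(R, G, B: Byte): Integer;
begin
  Result := (R shl 16) or (G shl 8) or B;
end;

{ Swap the red and blue bytes, converting between $00BBGGRR
  and $00RRGGBB in either direction. }
function SwapRedBlue(C: Integer): Integer;
begin
  Result := ((C and $FF) shl 16) or (C and $FF00) or ((C shr 16) and $FF);
end;

begin
  { Pure red in frame layout, then the same value with red and blue swapped. }
  WriteLn(FrameColor(255, 0, 0));
  WriteLn(SwapRedBlue(FrameColor(255, 0, 0)));
end.
```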



The next routine fills the DIB with a given color.

procedure TFrame.Fill(FColor:integer);assembler;
var X,Y,indexY,indexP,Color:integer;
asm
mov ecx,[eax].Bitmap
cmp ecx,0
je @filldone
{check DIB exists and exit if not}
mov ecx,[eax].FWidth
mov X,ecx
mov ecx,[eax].FHeight
mov Y,ecx
mov ecx,[eax].lpvbits
mov indexY,ecx
mov indexP,ecx
and edx,0ffffffh
mov Color,edx
{initialize variables X and Y act as counts,
each horizontal line is considered in turn
indexY holding the address of point (0,Y)
for a given Y. There after each iteration
adds three to this value storing the result
in indexP, each successive value corresponding
to the address of a point on the horizontal
scan line. When the count reaches zero the
line has been completed, and the next scan
line is considered by adding FLineLen to indexY
and resetting X and indexP. When Y equals
zero the fill has been completed without
resorting to multiplication}
@startfill:
mov edx,indexP
mov ecx,[edx]
and ecx,0ff000000h
xor ecx,Color
mov [edx],ecx
add edx,3
mov indexP,edx
mov ecx,X
dec ecx
mov X,ecx
cmp ecx,0
jg @startfill
mov edx,indexY
add edx,[eax].FLineLen
mov indexY,edx
mov indexP,edx
mov ecx,[eax].FWidth
mov X,ecx
mov edx,Y
dec edx
mov Y,edx
cmp edx,0
jg @startfill
@filldone:
end;
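
The same multiplication-free walk through the pixel array can be sketched in plain Pascal, which may help when checking the assembler. This sketch assumes an ordinary dynamic byte array standing in for the DIB's memory. FillBuffer and TByteBuffer are hypothetical names. Unlike the assembler it writes the three color bytes one at a time, so it never touches the padding bytes or anything past the end of the array.

```pascal
program FillDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TByteBuffer = array of Byte;  { hypothetical stand-in for DIB memory }

{ Fill a 24bit pixel buffer using running indices, as in Fill:
  IndexY tracks the start of each scan line, IndexP the current pixel. }
procedure FillBuffer(var Buf: TByteBuffer;
  Width, Height, LineLen, Color: Integer);
var
  X, Y, IndexY, IndexP: Integer;
begin
  IndexY := 0;
  for Y := 1 to Height do
  begin
    IndexP := IndexY;
    for X := 1 to Width do
    begin
      Buf[IndexP] := Color and $FF;               { blue }
      Buf[IndexP + 1] := (Color shr 8) and $FF;   { green }
      Buf[IndexP + 2] := (Color shr 16) and $FF;  { red }
      Inc(IndexP, 3);
    end;
    Inc(IndexY, LineLen);
  end;
end;

var
  Buf: TByteBuffer;
begin
  { Two pixels wide, two lines high: 6 bytes per line padded to 8. }
  SetLength(Buf, 8 * 2);
  FillBuffer(Buf, 2, 2, 8, $112233);
  WriteLn(Buf[0], ' ', Buf[1], ' ', Buf[2], ' ', Buf[6]);
end.
```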



Next, a plot function that uses our previously defined PVector pointer in place of explicit coordinates.

procedure TFrame.VPlot(V:PVector;Color:integer);assembler;
asm
push ebx
mov ebx,[eax].Bitmap
cmp ebx,0
je @vplotdone
{if the value of Bitmap is zero
then no memory has been allocated to the DIB.
All we can do is abort the plot.}
pop ebx
push ebx
{recover value of ebx
without affecting the stack}
cmp edx,0
je @vplotdone
{if edx = V = 0 then the vector
pointer passed is undefined, so exit}
mov ebx,ecx
{with only two parameters, Color arrives
in ecx rather than on the stack, so copy
it to ebx before ecx is reused for Y}
mov ecx,[edx].TVector.Y
mov edx,[edx].TVector.X
{now move the vector coordinate
values into edx and ecx and the
rest of the routine is the same as Plot}
cmp edx,0
jl @vplotdone
cmp edx,[eax].FWidth
jge @vplotdone
cmp ecx,0
jl @vplotdone
cmp ecx,[eax].FHeight
jge @vplotdone
push eax
push edx
mov eax,[eax].FLineLen
mul ecx
mov edx,eax
pop ecx
pop eax
add edx,ecx
shl ecx,1
add edx,ecx
add edx,[eax].lpvbits
mov ecx,[edx]
and ecx,0ff000000h
and ebx,0ffffffh
{ebx still holds Color, so we only
need to trim it to three bytes}
xor ecx,ebx
mov [edx],ecx
@vplotdone:
pop ebx
end;



Just for fun, here is a basic airbrush routine. It is not optimized, but does illustrate how I tend to test ideas, and uses many of the ideas discussed earlier. When used, the airbrush routine produces a circle of the desired color and radius, centred on X,Y, whose effect on the original image lessens as the perimeter of the circle is approached.

procedure TFrame.AirBrush
(FX,FY,Radius,Color:integer);assembler;
var X,Y,X0,Y0,X1,Y1,Xd,Yd,R2,D2,newColor:integer;
{the variables declared are all
of the constant values which will be used
X,Y centre of airbrush plot
X0,Y0 bottom left coordinate of square
to scan = X-Radius,Y-Radius
X1,Y1 top right coordinate of square to
scan = X+Radius,Y+Radius
Xd,Yd current point being considered
R2 square of the Radius
D2 square of the distance of current
point Xd,Yd from centre
newColor holds the color value for current
point as it is being constructed}
asm
jmp @airstart
{define subroutines}
@airpointok:
{checks point Xd,Yd is valid,
if valid edx = address, if not edx = 0}
push ecx
mov ecx,Yd
cmp ecx,0
jl @airpointerror
cmp ecx,[eax].FHeight
jge @airpointerror
push eax
mov eax,[eax].FLineLen
mul ecx
mov edx,eax
pop eax
mov ecx,Xd
cmp ecx,0
jl @airpointerror
cmp ecx,[eax].FWidth
jge @airpointerror
add edx,ecx
shl ecx,1
add edx,ecx
pop ecx
add edx,[eax].lpvbits
ret
@airpointerror:
pop ecx
mov edx,0
ret
@airblend:
{takes the intensity of R,G or B, 0 -> 255,
ecx = current value, edx = new value and
blends them according to current value of
D2, the square of the distance from X,Y.
returns value in ecx}
push eax
push edx
mov eax,D2
mul ecx
mov ecx,eax
pop edx
mov eax,R2
sub eax,D2
mul edx
add eax,ecx
xor edx,edx
mov ecx,R2
div ecx
mov ecx,eax
pop eax
ret
@airstart:
{initialize all variables}
mov X,edx
mov Y,ecx
sub edx,Radius
mov X0,edx
mov Xd,edx
add edx,Radius
add edx,Radius
mov X1,edx
sub ecx,Radius
mov Y0,ecx
mov Yd,ecx
add ecx,Radius
add ecx,Radius
mov Y1,ecx
mov ecx,Radius
cmp ecx,0
jle @airdone
push eax
mov eax,Radius
imul eax
mov R2,eax
pop eax
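@airloop:
{start of main loop}
push eax
mov eax,Xd
sub eax,X
imul eax
mov ecx,eax
{ecx = square of the horizontal distance}
mov eax,Yd
sub eax,Y
imul eax
add eax,ecx
mov D2,eax
pop eax
{D2, square of the distance
of current Xd,Yd from centre
now calculated and stored}
mov ecx,D2
cmp ecx,R2
jg @airpointdone
{points in the corners of the scanned
square lie outside the circle, skip them
so that R2-D2 is never negative in @airblend}
call @airpointok
cmp edx,0
je @airpointdone
{now know current point
OK and have its address in edx}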
mov ecx,[edx]
push edx
push ecx
{get pixel color value and save
pixel address and color on stack}
and ecx,0ff000000h
mov newColor,ecx
{grab fourth byte of color
value and store in newColor}
pop ecx
push ecx
and ecx,0ff0000h
shr ecx,16
mov edx,Color
and edx,0ff0000h
shr edx,16
call @airblend
{recover color value but maintain stack status,
isolate Red value and shift right so that Red
intensity is in range 0->255 to keep subroutine
@airblend happy. Do same with color value to be
applied. Call @airblend to blend these color values
according to status of R2 and D2, returning
modified value in ecx}
shl ecx,16
{shift back to position
of red intensity}
mov edx,newColor
xor edx,ecx
mov newColor,edx
{update newColor}
{now do this again
for the Green values}
pop ecx
push ecx
and ecx,0ff00h
shr ecx,8
mov edx,Color
and edx,0ff00h
shr edx,8
call @airblend
shl ecx,8
mov edx,newColor
xor edx,ecx
mov newColor,edx
{and again for Blue}
pop ecx
and ecx,0ffh
mov edx,Color
and edx,0ffh
call @airblend
mov edx,newColor
xor ecx,edx
pop edx
mov [edx],ecx
{finally recover address of pixel,
and update using newColor}
@airpointdone:
{and we end with the standard
loop control checks}
mov ecx,Xd
inc ecx
mov Xd,ecx
cmp ecx,X1
jle @airloop
mov ecx,X0
mov Xd,ecx
mov edx,Yd
inc edx
mov Yd,edx
cmp edx,Y1
jle @airloop
@airdone:
end;
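
The weighting applied by @airblend is easier to see in plain Pascal. The sketch below assumes the same inputs as the subroutine: a current and a new channel intensity in the range 0 to 255, the squared distance D2 and the squared radius R2, with D2 no greater than R2. BlendChannel is a hypothetical name used only for this illustration.

```pascal
program BlendDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Weighted average of one colour channel, following @airblend.
  Assumes 0 <= D2 <= R2 and R2 > 0. At the centre (D2 = 0) the new
  value wins outright; at the perimeter (D2 = R2) the current value
  is left unchanged. }
function BlendChannel(Current, NewValue, D2, R2: Integer): Integer;
begin
  Result := (D2 * Current + (R2 - D2) * NewValue) div R2;
end;

var
  D: Integer;
begin
  { Radius 4: walk from the centre out to the perimeter,
    spraying full intensity on to a black pixel. }
  for D := 0 to 4 do
    WriteLn('distance ', D, ': ', BlendChannel(0, 255, D * D, 16));
end.
```

Because the weighting uses the squares of the distances, the effect falls away slowly near the centre and quickly towards the edge, which is what gives the brush its soft outline.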



Implementing routines for drawing squares and circles should now be within your grasp. Triangles can be tricky though. Nevertheless you have in your hands all of the tools required.

Debugging your code
To conclude this article, a few words about debugging seem in order. It is very easy to set up watches and breakpoints, and to step through Delphi programs a line at a time. The same is true even when using assembler. All one needs to do is add the four 32bit general registers eax, ebx, ecx and edx to one's watch list, and see the effect of each line of assembler. When dealing with the stack, try numbering each push, giving the same number to each corresponding pop. It is usually best to do this before running the code for the first time. Where possible, break down complex algorithms into small, relatively simple subroutines, and make as much use as possible of local variables. Both these courses of action will hinder your code's performance, but you are more likely to produce code that works.

And finally
Enough has been covered in this article for you to explore the possibilities of assembler. However, the majority of assembler instructions actually available to you have been ignored. Should you wish to learn more, may I suggest Borland's Turbo Assembler, just for the manuals, and Wrox's Assembly Language Master Class, which is in my opinion the finest book of its type available. Neither of these products directly addresses the use of assembler in Delphi, nor in Windows 95, but both give a good grounding in assembler algorithm design.
