Object Oriented Design
|Tuesday, April 5th, 2011|
Leaf-only class instantiations
While trying to explain an idea to a colleague, I recently stated a simple limitation on how to build class hierarchies, and it got me wondering whether this limitation is a generally applicable rule of thumb: only leaf classes in a hierarchy should have instances. To be precise, I am defining a leaf class as one with no subclasses.
My reasoning is that overriding pre-existing implementations is generally dangerous, since understanding how that implementation is connected to other methods is, generally speaking, hard to do. Leaving the interior classes purely abstract means that they can only implement code common to all derived classes: usually the shared public interface, which depends on the purely abstract methods to supply subclass-specific functionality within the framework of a generally common concept.
(Note: in languages without explicitly abstract methods, such as Smalltalk and Objective-C, I assume that these de facto abstract methods are either left empty or unconditionally throw an exception.)
This forces such interactions between superclass and subclass implementations to be explicitly spelled out as part of the abstract methods' APIs.
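A minimal Java sketch of the rule (the classes here are my own hypothetical example): the interior class stays abstract, implements only shared code, and spells out its one extension point explicitly.

```java
// A sketch of the leaf-only rule: the interior class is abstract and
// implements only code common to all subclasses, delegating the
// subclass-specific step to an abstract method (a template method).
abstract class Report {
    // Common public interface: implemented once, never overridden.
    final String render(String title) {
        return "== " + title + " ==\n" + body();
    }

    // The only extension point; its contract is part of the API.
    abstract String body();
}

// Only leaf classes like these are ever instantiated.
class PlainReport extends Report {
    @Override String body() { return "plain text body"; }
}

class CsvReport extends Report {
    @Override String body() { return "a,b,c"; }
}

public class LeafDemo {
    public static void main(String[] args) {
        Report r = new CsvReport();   // never "new Report()"
        System.out.println(r.render("Q1"));
    }
}
```

Because the superclass-subclass interaction flows only through `body()`, there is no inherited implementation for a leaf to accidentally break.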
This statement ignores the obvious concerns around non-private member variables, as I always assume that no superclass read/write state can be directly manipulated by any subclass (as opposed to write-only state, which is typically only for logging and the like, and read-only state, which is typically just a detail of initialization).
There may be cases where this approach seems heavyweight, but I cannot think of cases where it is limiting or wrong from a design point of view.
I further wonder if this can be justified by phrasing the idea in terms of prototype-based languages: the interior classes are merely prototype objects, and the clones of those objects which specialize in specific kinds of problem are literally their subclasses. This is probably grasping a little too far, but I always try to bring these ideas back to such class-less OO constructs. In short, it is an attempt to limit the impact of the fragile base class problem. Obviously, any language which permits implementation inheritance will suffer some degree of this problem, but I am trying to find the low-hanging fruit in this suggestion.
Is there anything here, or am I missing some obvious wrongness?
...Nights Current Mood: curious
|Saturday, August 21st, 2010|
Beyond IMP-caching: Hiding Virtual Dispatch Cost on Repeated Operations
The cost of virtual method dispatch seems to be an especially problematic sticking point when it is used to access a simple implementation over and over. A familiar example is iterating a data set where the method to access the next element of a complex data model is frequently implemented via virtual dispatch (since iterator interfaces and mechanics are usually completely common with the exception of a single, usually very small, look-up method). For those of us familiar with Mac OS X, a good example of a pattern to resolve this is the NSFastEnumeration formal protocol.
The pattern underlying the protocol is a simple, scalable, and relatively elegant solution to the problem of a high virtual dispatch cost becoming a multiplier in the expense of a linear operation: instead of n invocations, each with 1 result, make 1 invocation with n results. From the point of view of the CPU, it merely shrinks the size of an inner loop by hoisting out a loop invariant (the virtual dispatch and function call).
The implementation of a solution based on this pattern is simple: allocate a large output buffer on the caller's stack and get the one virtual dispatch to populate as much of it as possible, instead of only returning one value. Then, in the caller, loop over the buffer and operate on the iterator's output without needing to call it for each step. An additional state structure is allocated in the caller's stack frame to allow the iterator to pick up where it left off (unless the situation allows the state to be preserved within the iterator instance).
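A minimal sketch of the batching idea in Java (the names are my own; NSFastEnumeration itself is an Objective-C protocol that works through a C buffer and state struct in just this spirit):

```java
// Hypothetical batched iterator: one virtual call yields up to n
// results, amortizing the dispatch cost across the whole batch.
interface BatchIterator {
    // Fills 'buffer' from index 0; returns the number of elements
    // written, or 0 when the sequence is exhausted.
    int next(int[] buffer);
}

// Example source: the integers [0, limit).
class RangeSource implements BatchIterator {
    private int cursor = 0;
    private final int limit;

    RangeSource(int limit) { this.limit = limit; }

    @Override
    public int next(int[] buffer) {
        int count = Math.min(buffer.length, limit - cursor);
        for (int i = 0; i < count; i++) buffer[i] = cursor + i;
        cursor += count;
        return count;
    }
}

public class BatchDemo {
    public static void main(String[] args) {
        BatchIterator it = new RangeSource(10);
        int[] buffer = new int[4];   // caller-owned batch buffer
        long sum = 0;
        int count;
        while ((count = it.next(buffer)) > 0) {
            // Inner loop runs with no virtual dispatch per element.
            for (int i = 0; i < count; i++) sum += buffer[i];
        }
        System.out.println(sum);     // 0+1+...+9 = 45
    }
}
```

Here the cursor lives inside the iterator; the caller-side state structure described above becomes necessary when the iterator must stay stateless between loops.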
While iteration is the obvious application of the pattern, it can apply to various other problem spaces (although it is most obviously beneficial when the cost of the virtual dispatch is a substantial part of the call expense). Other examples which come to mind are numerical sequences (if computing the next value in a sequence is simple, compute the next n to avoid n-1 invocations), simple data transformations (pass in both an input set and an output set - moves the loop inside the virtual dispatch instead of including it as part of the loop's iteration cost), etc.
The idea seemed worth pointing out as I have too often seen ugly designs used to get around the feared cost of repeated virtual dispatches (or even function calls) and this pattern is a far cleaner way to see a substantial performance boost without compromising the cleanliness of your design.
Edit: After thinking it through, I see why Apple's iterator would need external state: it allows user code to exit the iteration loop early while still letting the next iteration attempt know that it must start over. Without external state, that would require additional explicit calls (which would not fit Apple's syntactic sugar) unless the state were recreated each time the loop is entered.
|Friday, December 30th, 2005|
Making heavy abstractions easier to swallow with opaque factories
I was recently working on a project which involved mining several data structures out of a binary data file. These data structures were complicated entities which each stored several attributes, arranged as a stream of records stored at some offset into the file (note that finding this offset involved several seek operations, as the index was not written with easy readability in mind).
The problem we faced was that we wanted the code responsible for reading this file to be reusable and to have no knowledge of the higher-level objects in the specific application (in short, we wanted it designed well). We were actually rewriting this program to replace one which had no defined boundaries between the components; to that end, its low-level reader component created instances of the high-level objects as it iterated over the stream of records. In our implementation, this would not do. We had some ideas for how to do this better:

1) Give the low-level component a very large API so that we could query each attribute by some kind of abstract record ID.
PROS: the division between the components is very easy to see and makes no assumptions about behaviour on either side of the interface.
CONS: it would take a very large amount of code to implement and work with such a large interface, and internal performance would be terrible since the speed of seek-less streaming reads through the file would never be realized (clever caching could recover some of this, but that starts to make assumptions about how the interface is being used).

2) Allow the low-level component to create its own representations of the objects and return them to the higher-level component, which would then deconstruct and reconstruct the objects as it sees fit (the simplest but loosest way of doing this is with some sort of dictionary for each record).
PROS: the interface stays nice and small, and performance is still good since the low-level component is permitted to read the file in whichever way is most effective.
CONS: the data becomes so hidden that it may be difficult to see what is wrong when a bug is found; all the unwrapping and reconstruction code is error-prone, tedious to write, and potentially large; and the entire stream must be read into memory before returning the collection, which may be unwarranted in some applications.

3) Create an interface (this was Java, so I will use Java terms) in the low-level component which can be implemented by a factory in the high-level component and passed in. The low-level component can then ask the factory to create the instances for it as it finds records in the data stream. It collects and returns these objects created by the factory without any knowledge of what they actually are (they could also adhere to some other interface which the low-level component would use to work with them, but treating them as totally opaque java.lang.Object instances was perfect for our needs).
PROS: the interface stays simple (easily one method, "process(Factory)"); no assumptions are made about how either side works; future complexities can be offloaded to the factory (configuration options, etc., need not be propagated down to the low-level component if the factory knows about them and has the right exposure to apply them); and the streaming nature of the storage can be leveraged by the high-level component, since the factory methods are called as the data is processed (consider, for example, generating an HTML report of the contents simply by streaming UTF-8 data from the factory whenever the methods it is interested in are called: the memory footprint stays only big enough for one record even as n are processed).
CONS: may be overkill for simple records.
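A sketch of option 3 in Java, with hypothetical names (the real project's interfaces were surely richer):

```java
// Option 3: the low-level reader knows only this factory interface;
// the high-level application implements it and passes it in.
import java.util.ArrayList;
import java.util.List;

interface RecordFactory {
    // Called once per record found in the stream; the reader treats
    // the result as an opaque Object.
    Object createRecord(int id, String payload);
}

class StreamReader {
    // Stand-in for the binary stream: one payload per record.
    private final String[] rawRecords;

    StreamReader(String[] rawRecords) { this.rawRecords = rawRecords; }

    // The single entry point: stream the records, asking the factory
    // to build whatever the caller wants for each one.
    List<Object> process(RecordFactory factory) {
        List<Object> results = new ArrayList<>();
        for (int i = 0; i < rawRecords.length; i++) {
            results.add(factory.createRecord(i, rawRecords[i]));
        }
        return results;
    }
}

public class FactoryDemo {
    public static void main(String[] args) {
        StreamReader reader = new StreamReader(new String[] {"alpha", "beta"});
        // The high-level side decides what a "record" is; here, a string.
        List<Object> records = reader.process((id, payload) -> id + ":" + payload);
        System.out.println(records);
    }
}
```

A factory that streams its output (rather than collecting it) would drop the list entirely and keep the memory footprint at one record.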
Needless to say, we went with option 3 and were surprised by the later enhancements we could make and the ease with which other programs could reuse the component.

Enhancements: it became useful to allow the data file to be fractured into smaller pieces, and we were trivially able to extend the factory interface to let the higher-level component take control over the search path and open-file handling for these other files.

Other components: we were able to easily write unit tests and small helper tools built on the low-level component by selectively implementing the parts of the factory interface needed for a specific purpose.
For the other OS X guys out there, this is also similar to how the entire NSCoder hierarchy works. That example is also interesting in that it uses the streaming advantage to write partially encoded data, as it runs.
|Thursday, June 2nd, 2005|
Google Launches Summer of Code
"Google Launches Summer of Code
This Summer, don't let your programming skills lie fallow...Use them for the greater good of Open Source Software and computer science! Google will provide a $4500 award to each student who successfully completes an open source project by the end of the Summer.
By pairing applicants up with the proven wisdom and experience of established prominent open source organizations (listed below), we hope to make great software happen. If you can't come up with a great idea to submit, a number of our organizations have made idea lists available."
Sounds awesome, too bad it didn't come along while I was still in school...
|Friday, April 22nd, 2005|
I just found this community. It looks inactive, but let's see. I recently read a really good comment about objects on the fit-dev mailing list. Ward Cunningham was speaking about a particular aspect of the framework and had this to say:
In a little project it would be nice if the built-in types were all you needed. But in a big project, you are probably already creating infrastructure for application specific type conversion for the values used in your domain.
People who don't get WholeValue will read lots of little integers and strings from lots of little fields and let the business logic assemble them into meaningful values. But that is a foolish way to program when you have objects. Better to make domain type aware widgets store whole values into domain objects and write the rest of the business logic at that level of abstraction.
I think one reason that people don't get Whole Value is that they are database centric and their database makes them chop their domain values into lots of little integers and strings to store them. I see people refuse to use the abstracting capability of their language and then say that objects don't really work that well.
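A small Java sketch of the Whole Value idea (the Money type and its text format are my own hypothetical example, not from fit):

```java
// Whole Value sketch: instead of passing around a raw int of cents
// and a raw String currency code, the domain carries one meaningful
// value object, and business logic is written at that level.
public final class Money {
    private final long cents;
    private final String currency;

    private Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    // Parse the whole value once, at the edge of the system
    // (e.g. in a domain-type-aware widget).
    public static Money parse(String text) {
        String[] parts = text.trim().split(" ");      // e.g. "19.99 USD"
        String[] amount = parts[0].split("\\.");
        long cents = Long.parseLong(amount[0]) * 100 + Long.parseLong(amount[1]);
        return new Money(cents, parts[1]);
    }

    // Business logic never sees the little integers and strings.
    public Money plus(Money other) {
        if (!currency.equals(other.currency))
            throw new IllegalArgumentException("currency mismatch");
        return new Money(cents + other.cents, currency);
    }

    @Override
    public String toString() {
        return (cents / 100) + "." + String.format("%02d", cents % 100) + " " + currency;
    }

    public static void main(String[] args) {
        Money total = Money.parse("19.99 USD").plus(Money.parse("5.01 USD"));
        System.out.println(total);
    }
}
```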
|Monday, November 15th, 2004|
Accessors and Evil
It's been about a year since I read Allen Holub's (fairly) infamous tirade on the evils of accessors.
The article's heavy-handed and designed to be inflammatory, but there's a core of truth. Former procedural programmers (and VB programmers!) seem to care only about an object's attributes. O-O programmers care more about object behaviour. Case in point: at university, my systems analysis lecturer described objects as "like [database] tables with functions".
Holub's suggestion of business objects with 'drawYourself()' methods seems pretty disastrous for a large system, and his argument that "you're not really putting UI code in the business logic tier, because Swing is just an abstraction" doesn't hold water. Still, it made me think, and that's always good. How about using something like the memento pattern? How about passing attributes as Strings?
At what point does pragmatism win out over purism?
|Monday, September 6th, 2004|
Event-Based Interfaces: What's in a Name?
I was recently working on a large system for automation of remote tasks, etc, in Java. I noticed a key difference in how I named the methods in the various event interfaces when compared to my coworker and it is something I picked up from Cocoa: proper tense and order implied by the name of a method.
For example, both of us accidentally implemented two methods which were very similar (and later coalesced into one function) but had slightly different names:
His: public void onObjectStateChange(Object sender)
Mine: public void objectDidChangeState(Object sender)
I never realized the true power of how Cocoa methods are named (I am thinking of things like applicationDidFinishLaunching: here) until I saw this difference. In the first example, the word on doesn't really mean anything. It sounds as though the method is being called at the same time as the change is occurring, which isn't possible.
The second method, however, specifies when the event is firing relative to internal changes to this sending object. This seems more meaningful and means less ducking out to read the method's documentation to answer the simple question of: when does this get called relative to the operation in the calling object?
This is only a small example of something I find very powerful within much of the Cocoa framework. It is also a concept which scales to event predicates in the framework, such as applicationShouldTerminate:, but can go further (I just can't think of anything more complicated, off the top of my head).
Like most design principles, it isn't too ground breaking and seems small but it makes code much more meaningful.
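A tiny Java sketch of the convention (a hypothetical interface; the did/should names mirror Cocoa's tense-and-order style):

```java
// Hypothetical observer interface using Cocoa-style tense and order:
// each name answers "when does this fire, relative to the operation?"
interface TaskObserver {
    // Predicate: fires before the operation; returning false vetoes it.
    boolean taskShouldStart(Object sender);

    // Fires after the operation has completed.
    void taskDidFinish(Object sender);
}

public class NamingDemo {
    public static void main(String[] args) {
        TaskObserver observer = new TaskObserver() {
            @Override public boolean taskShouldStart(Object sender) {
                System.out.println("asked for permission");
                return true;
            }
            @Override public void taskDidFinish(Object sender) {
                System.out.println("notified after the fact");
            }
        };
        Object task = new Object();
        if (observer.taskShouldStart(task)) {
            // ... do the work ...
            observer.taskDidFinish(task);
        }
    }
}
```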
|Thursday, July 15th, 2004|
Are there any decent (preferably freeware) UML diagramming programs out there?
I've used an ancient version of Select Enterprise at college, and it was bug-ridden torture.
|Friday, March 26th, 2004|
The issue of assigning to self
This isn't something which applies to object-oriented design in general, but it may be of interest to those of you who also use Objective-C as your weapon of choice.
A sometimes controversial issue in Obj-C initializer implementation is whether to assign the return value of the initializer you call (within your own initializer) to the instance's self pointer. I can't remember the reasons for not assigning it, but there are myths (at least, I can't see how any could be true) spread among novice Obj-C developers that assigning to self is never a good idea. It is my belief that it should always be assigned.
Observe this code snippet:
- (id)init
{
    if (![super init])
        return nil;
    // ... set up ivars ...
    return self;
}
Now, this snippet seems to be what most init implementations look like. It does some things which could be considered bad style (multiple returns) and one thing which could be completely wrong: it doesn't consider that [super init] could return something other than self or nil.
It may seem absurd that init could return anything other than self or nil. After all, init will return nil if some pre-condition is not met (it will release itself, too) and self is the pointer given to us by the class alloc method so why would we be changing it?
The classic example of [super init] returning something other than self or nil is a class cluster. When a class cluster is alloc'ed, the object returned is an instance of some temporary placeholder class. When you send it an initSomething message, this temporary object actually releases itself and returns an instance of some other class which is suited to dealing with the kind of data that was passed as a parameter (or is implied by the method, in name alone).
Now, the experienced programmer will point out that, if you are subclassing an arbitrary base type which has no documentation describing how you are to subclass it, you are asking to have strange things happen. That is true, but this is only an example. There are countless other reasons why some arbitrary initializer might want to return a different object than the one it is called on. Remember, in Objective-C, there is nothing special about init, except that we have a convention to use it. That is a very powerful concept as long as you remember that there is nothing magical or special about it. To quote a short green guy who claims to know something: "You must unlearn what you have learned."
With this in mind, the above code would look more like:
- (id)init
{
    self = [super init];
    if (self != nil)
    {
        // ... set up ivars ...
    }
    return self;
}
This way, if the pointer changes and the old object is freed, we will access the correct ivars relative to our corrected self pointer (the other one now points to a freed object). Also, we will return the pointer which we were given by the super call, and not the one we started with (which may be completely bogus by now).
Am I missing anything else,
|Tuesday, March 16th, 2004|
Language without control structure
One of the things I have always liked about Smalltalk is that the language has no built-in concept of control structures. That is to say that structures such as if-then-else and various types of loops are not special within the language, as they are within imperative languages. Specifically, I will be talking about the ifTrue: and ifFalse: messages and how they are directly reducible to the control flow structures in the lambda calculus model of computation.
First of all, one must realize that Smalltalk is a purely object-oriented language with purely dynamic types. That is to say that the language consists of essentially three things: classes, objects, and messages (some people will claim that having classes makes Smalltalk impure, but you get the idea). Given an object, a message sent to that object (potentially carrying other objects as arguments) will return another object. This paradigm is very powerful and is even used to get basic control flow without extending the language. Observe this snippet for an example:
(5 < 7) ifTrue:[self doSomething] ifFalse:[self doSomethingElse].
In this example, the object 5 (an instance of the Integer class with value 5) is sent the < message with the argument 7 (another Integer object). The result will be an object of type Boolean (an abstract class, although such things don't exist in Smalltalk in the loaded sense of the term); in this case, an instance of the Boolean subclass True. Boolean objects understand the message ifTrue:ifFalse: as well as the conveniences ifTrue: and ifFalse: alone. These messages take code blocks (a special kind of Smalltalk argument which is evaluated lazily, in the manner of normal-order reduction), and the True subclass overrides the message to invoke the ifTrue: argument while False invokes the ifFalse: argument. As a result, control flow occurs without an actual grammar rule for it.
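The same dispatch-driven control flow can be sketched in Java (a loose analogy only; Smalltalk blocks map here to Runnable):

```java
// A loose Java analogy to Smalltalk's ifTrue:ifFalse:. The "control
// structure" is just polymorphic dispatch on a Boolean-like hierarchy;
// the branches are passed in as deferred blocks (Runnable).
abstract class Bool {
    abstract void ifTrueIfFalse(Runnable trueBlock, Runnable falseBlock);
}

class True extends Bool {
    @Override void ifTrueIfFalse(Runnable trueBlock, Runnable falseBlock) {
        trueBlock.run();   // True evaluates only the first block
    }
}

class False extends Bool {
    @Override void ifTrueIfFalse(Runnable trueBlock, Runnable falseBlock) {
        falseBlock.run();  // False evaluates only the second block
    }
}

public class DispatchDemo {
    static Bool lessThan(int a, int b) {
        return a < b ? new True() : new False();
    }

    public static void main(String[] args) {
        // (5 < 7) ifTrue:[self doSomething] ifFalse:[self doSomethingElse].
        lessThan(5, 7).ifTrueIfFalse(
            () -> System.out.println("doSomething"),
            () -> System.out.println("doSomethingElse"));
    }
}
```

Note that no if statement selects the branch at the call site; the subclass of the receiver does the selecting, just as the value of the predicate does in the lambda calculus version below.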
I always thought this was pretty neat and a good example of how a powerful paradigm can scale. Smalltalk is also interesting in that doing so doesn't make the code hard to understand or excessively long. Something I only found out recently is that this is exactly how control flow is done in the lambda calculus (though there, applicative-order (eager) reduction is used). Observe this example of the above operation (I will call the method doSomething A and doSomethingElse B):
(LaTeX notation, since that is as good as any)
(\lambda x. \lambda y. x) A B
In this example, I just hard-coded the comparison of (5 < 7) as true, i.e. (\lambda x. \lambda y. x), since programming in the lambda calculus is ugly as hell and would have required me to write some really complicated predicate.
In both the lambda calculus example and the Smalltalk example, we see that the value of the predicate does the work of selecting whether we execute the true branch or the false branch of the condition. I found it interesting to notice this since I normally wouldn't have expected a very high level object oriented language to mirror a model of computation (a functional model of computation, at that) in such a direct way.
Any other observations?
...Nights Current Mood: surprised
|Tuesday, February 3rd, 2004|
Where Constructors are not Special
I recently finished a CS assignment for which my group decided on Java as our implementation language of choice (knowing my group members, it was either that or C++). I found some inherent design limitations, however, as a result of how Java (and, similarly, C++) treats constructors.
I have become accustomed to the way that purer OO languages (Smalltalk and Objective-C, for example) treat constructors, and how that can be used as a form of scalability in design.
Why Java and C++ have limitations:
Using the approach of a constructor being special costs a class the ability to make intelligent decisions. It means that a constructor must give the caller an instance of the type that the user specified, or throw some kind of exception to fail. It means that design paradigms such as a class cluster cannot be implemented without some kind of convention to ask the newly-constructed object to recreate itself as something more specialized. This may not seem like a problem to developers who are not used to how more generic OO languages work; however, I will outline the advantage from the other side, below.
How purer OO languages leverage this difference:
In a language such as Smalltalk or Objective-C, a constructor does not exist as some special entity. To instantiate an object, you ask the class to give you an instance. The important thing to note here is that it doesn't need to give you an instance of itself. This gives rise to the approach of passing this class method some arguments which describe the object you want, and having it give you a subclass of itself which is best suited to your needs.
This may not sound like a big deal except that the code asking for the new instance doesn't need to know anything about these subclasses. The only thing it needs to know is that the new instance should respond to the methods that the parent object claims it will. Thus, the user code always treats the new instance as a member of the class that it wants to think it is. Note that it may be something totally bogus but that isn't really a problem because it requires someone going out of their way to make meaningless code (and, in the case of Objective-C, ignoring a bunch of compiler warnings).
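Even in a language like Java, the shape of the idea can be approximated with a static factory method (hypothetical classes; the caller never learns which subclass it received):

```java
// Approximating "ask the class for a suitable instance" in Java with
// a static factory. The caller relies only on the Shape interface and
// never learns which subclass it was given.
abstract class Shape {
    abstract double area();

    // The class decides which specialized subclass fits the request.
    static Shape forSides(int sides, double size) {
        if (sides == 4) return new Square(size);
        return new RegularPolygon(sides, size);
    }
}

class Square extends Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }
}

class RegularPolygon extends Shape {
    private final int sides;
    private final double side;
    RegularPolygon(int sides, double side) { this.sides = sides; this.side = side; }
    @Override double area() {
        return sides * side * side / (4 * Math.tan(Math.PI / sides));
    }
}

public class FactoryShapeDemo {
    public static void main(String[] args) {
        Shape s = Shape.forSides(4, 3.0);   // actually a Square
        System.out.println(s.area());
    }
}
```

The key difference from the purer OO languages is that in Java this routing must go through a separate static method, while `new Shape(...)` itself can never make the decision.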
Objective-C's spin on this: The Class Cluster design pattern
What makes this even cleaner (in that it moves more decision-making into safe code) is the approach that Objective-C takes with two-step initialization. For those of you not familiar with Objective-C, most objects are initialized by first asking the class for an instance and then asking that instance to initialize itself (the initializer just returns the instance). This is useful because the allocation class method returns an instance of a temporary class which knows how to do nothing except receive these initialization methods, which, in turn, cause it to return yet a different object (traditionally, another subclass of the parent). Thus, when the dust has cleared, each object serves some purpose: the parent defines an interface and implements common code, the temporary class knows how to decide which of its siblings to create, and the other subclasses can focus on handling their specific formulations of the cluster.
This is part of the reason why complaints like "why can't I just say [MyClass new]?" (as stated by many starting Objective-C developers) tend to go away once the developer becomes used to the immense power (and elegant simplicity) of this approach. The new class method is provided simply to call [[MyClass alloc] init], but its use is often viewed as being similar to the training wheels on a bike and is discouraged.
What this really means:
This approach allows any class to grow into a class cluster without breaking code (since knowledge of the cluster is limited to the class cluster itself).
It is a good reason (among the countless others) not to try to shoe-horn concepts from one language into another.
(sorry if this is a little meaningless, I haven't slept in quite some time) Current Mood: exhausted
|Sunday, January 25th, 2004|
A design that Apple uses everywhere in Cocoa's Foundation API is that of Class Clusters. Apple's documentation does a pretty good job of explaining it, so I will just evangelize a bit here.
The basic premise is that, not only should an object understand itself and how to operate, abstractly, on data, but it should also understand how to create something which is better suited for the task. It means that the user of the cluster never thinks about it as anything other than one class even though they never actually are working with an instance of that class. In fact, the object that they are using will probably, at some point, be one of two different objects, neither of which are the object that they think it is.
I definitely recommend this to anyone working on a project where the details of model objects should be abstracted. However, it is hard to implement in anything but Objective-C or Smalltalk, since they keep the concept of object initialization so general. It can be done in other languages, of course, but you get a lot of simplicity from two-stage object construction.
Back to work,
|Thursday, January 22nd, 2004|
Lately I've been finding myself less and less interested in MVC. It's a good pattern, a useful pattern, and definitely the sort of thing one wants to make use of (especially if it's used by one's framework-of-choice, like Cocoa), and I doubt I'm going to be giving it up. It's just not interesting to me any more.
What's interesting (for the moment) is the addition of filters to the mix. Filters are short-lived objects which serve to transform one kind of value to another kind, or perform some other transformation. You don't necessarily have to have them as objects; methods (perhaps added by a category) are just as worthy of the name. Regardless, the mechanism is less interesting than the filtering itself.
Mac OS X 10.3 adds a system of bindings to Cocoa which let you do a lot of UI updating (and the like) via Interface Builder. They provide classes like NSObjectController, NSArrayController, and NSUserDefaultsController to let you bind values to arbitrary objects. For instance, you can connect an NSArrayController to e.g. an array owned by a custom view object, and then bind a table view to the array controller to list the items in the view, all without adding any further code.
Of course, you're not always going to have values in the format you want. Because of this, we've got a set of subclasses of NSValueTransformer which provide an interface for transforming arbitrary values, and sometimes for transforming them back again.
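The filter idea itself is tiny. A sketch in Java, loosely modeled on NSValueTransformer (the names here are hypothetical):

```java
// A minimal value-transformer filter, loosely modeled on Cocoa's
// NSValueTransformer: a small object that converts values one way,
// and optionally back again.
abstract class ValueTransformer<A, B> {
    abstract B transform(A value);

    // Not every transformer is reversible; subclasses may override.
    A reverseTransform(B value) {
        throw new UnsupportedOperationException("not reversible");
    }
}

class CelsiusToFahrenheit extends ValueTransformer<Double, Double> {
    @Override Double transform(Double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }
    @Override Double reverseTransform(Double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}

public class TransformerDemo {
    public static void main(String[] args) {
        ValueTransformer<Double, Double> filter = new CelsiusToFahrenheit();
        System.out.println(filter.transform(100.0));
        System.out.println(filter.reverseTransform(32.0));
    }
}
```

The short lifetime is the point: a binding (or any caller) creates the filter, runs a value through it, and discards it, keeping the transformation logic out of both the model and the view.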
It's not the first time filter objects have been used, but it's a good example of how they can be put to use to good effect. Personally, my favourite transformer is the NSUnarchiveFromDataTransformer; using it and the controller layer, I made a small program without writing any code at all. That's more of a testament to Interface Builder than to filters in general, but nonetheless, they're a good pattern to put to good use.
Now if only Objective-C/Cocoa had lexical closures...
|Wednesday, January 14th, 2004|
Not too useful, but an observation
Last night, while trying to think of something interesting to post here, I was thinking of MVC, again. Anyone who knows me will know of my various concerns with the MVC paradigm since there are a few conflicting interpretations out there, and they are probably causing a lot of unneeded aggravation. However, that is a topic for another post (or 10).
I was thinking that MVC is actually a representation of a Turing Machine: there is a Controller (input tape), Model (internal state), and View (output tape).
Although I mostly think that this is nothing more than an interesting observation, it could have meaningful uses. For example, something not defined by any interpretation of MVC is the issue of inter-connection between components beyond the one set. Most of the time this can be one Controller working with several Models and Views. I wonder if this idea of TMs could be applied to design the relationships between many Controllers. That would mean that Controllers would have to be set up as if they were the Views for other Controllers that they were interested in (the same way that the output tape of one machine is made as the input to another).
I don't really think that this is the best way to think of it, but it seems interesting that something as abstract as a design paradigm may define enough to be reducible to the fundamental component of logic (a TM).
Just something to think about,
...Nights Current Mood: curious
|Tuesday, January 13th, 2004|
Welcome to the community for Object-Oriented Designers and software development-related discussions.
I created this community in response to my feeling that, although there is a great deal of attention given to programming languages and general coding, there is very little given to design. I think that design cannot be taught in the same way that general programming is taught, so this is a place where we can all come together to try to help each other out.
I hope to hear some interesting ideas which usually would go without a venue,
...Nights Current Mood: curious